fdetsch / MODIS

Download and processing framework for MODIS imagery. The package provides automated access to the global online data archives LP DAAC, LAADS and NSIDC as well as processing capabilities such as file conversion, mosaicking, subsetting and time series.
Other
57 stars 27 forks source link

error sf::gdal_utils("warp") in runGdal #93

Open itati01 opened 4 years ago

itati01 commented 4 years ago

Hi,

I want to download MOD16 and MYD16 data with runGdal but receive an error in sf::gdal_utils using "warp". Actually, I plan to reproject the data but I get this error even without specifying outProj etc.

modis <- runGdal(product = "M.D16A2", begin=modis_dates$beginDOY, end=modis_dates$endDOY, tileH = 18:19, tileV=4, 
#extent = poly_2, pixelSize=500, resamplingType="bilinear", outProj="EPSG:3035", dataFormat="GTiff",
outDirPath=out_path)

The traceback is:

3: stop(paste0("gdal_utils ", util, ": an error occured"))
2: sf::gdal_utils(util = "warp", source = gdalSDS, destination = ofile, 
       options = c(params, "-overwrite", "-multi"), quiet = !is.null(opts$quiet) && 
           opts$quiet)
1: runGdal(product = "M.D16A2", begin = modis_dates$beginDOY, end = modis_dates$endDOY, 
       tileH = 18:19, tileV = 4, outDirPath = out_path)

gdalSDS contained two elements but the error also occured with gdalSDS[[1]]. So, I replaced "warp" with "info" and the function continues until raster is complaining rightly about the (missing) file. But is "warp" actually needed if t_srs == s_srs?

P.S. Sporadically, I get the message "Output driver `83' not recognised or does not support direct output file creation."

fdetsch commented 4 years ago

@itati01 As suggested by @mlampros in #96, can you please

remotes::install_github("MatMatt/MODIS", ref = "develop")

and then try to rerun your code?

itati01 commented 4 years ago

I tested it again under Linux with an absolute outDirPath like /data/modis/output. The function writes MOD16A2.*.tif files but stops with an error.

The output of traceback is:

7: strsplit(x,":")
6: getSdsNames(SDSnames)
5: gsub("\"", "", getSdsNames(SDSnames))
4: getSds(HdfName = y, SDSstring = SDSstring)
3: FUN(X[[i]],
2: lapply(files, function(y){ getSds(HdfName = y, SDSstring = SDSstring)
1: runGdal(product = "M.D16A2", begin = modis_dates$beginDOY, end = dates$endDOY, tileH = 18:19, tileV = 4, outDirPath = out_path)

The output was

Downloading structure on 'LPDAAC' for: MOD16A2.006
Downloading structure on 'LPDAAC' for: MOD16A2GF.006

However, I did not find any *GF* files or folder (which I actually don't need) but I miss MYD16A2 files.

mlampros commented 4 years ago

@itati01,

if you re-project the data using an extent (as you've mentioned in your first comment with 'poly_2') then highly probable the error comes from the second call to sf::gdal_utils() which takes the extent as input and attempts to crop the downloaded .hdf file of the corresponding vertical and horizontal tile. The 'sf::gdal_utils()' is not very verbose when it throws an error.

I've used the

gdalUtils::gdalwarp()

function and I observed in my use case that re-projection caused the initial (sinusoidal projected) area to become very small ( dimensions of 0 x 5 ).

Something else that I observed is that the 'sf::gdal_utils()' function throws an error if the gdalSDS if of length > 1. Normally, this should not be an issue for gdal-warp because the source parameter can take more than 1 files as input (I mentioned it also in issue 98). I was able to overcome this second error by iterating over the 'gdalSDS' vector of files and picking the one with the bigger .hdf size (higher area coverage) using,


file.info(gdalSDS_iter)$size

where 'gdalSDS_iter' is one of the .hdf files.

itati01 commented 4 years ago

@mlampros , But I commented out the line with the extent and did not use it in my second test as well. I don't know the possible error messages of strsplit but could x become a "non-character argument" in getSds?

mlampros commented 4 years ago

@itati0,

to debug your error case and find out if 'x' does not take the expected value, you have to run the internal code of runGdal function line by line and use the triple-dot ( MODIS::: ) for the functions that are not exported (I guess the same did the author of the package as there are a couple of places where the first indices are commented out, here, here or here)

itati01 commented 4 years ago

@mlampros,

I followed your suggestion.

The problem is that my regex pattern not only matches "MOD16A2" and "MYD16A2" as intended but also "MOD16A2GF" which could not be downloaded (authentication fails). However, an empty hdf is created for the last of my three Modis dates (I tested August 2014).

Therefore, length(files) > 0 is TRUE and getSds is executed in the lapply where sf::unlist(sf::gdal_subdatasets(HdfName[1])) results in NULL which is passed to getSdsNames et voilà: "Error in strsplit(x, ":") : non-character argument".

So, it would probably be good to avoid the empty file. On the other hand, skipping invalid files (no subdatasets) could also be desirable. Better to open a new issue?

mlampros commented 4 years ago

@itati01,

if you came to a conclusion what the issue might be based on the data that you download, submitting a PR to fix this issue might be a better solution.

fdetsch commented 4 years ago

@itati01 You can use product = "M.D16A2$" to prevent the GF products from being processed. For regex input, it is intended behavior that all matching products are considered.

Meanwhile, I tested your code for all four products in August 2014 and it finished without issue, which makes it hard for me to assess what went wrong in your particular case.

itati01 commented 4 years ago

@fdetsch Thanks for the looking into my issue. I hardly use MODIS data, so was not aware of the GF product. Therefore, I initially did not use $ and kept this for the second test. Thanks for the hint, anyway.

If the GF data was downloaded during your test, there might indeed be some specific problem here. I assume a positive checkIntegrity before the lapply could avoid errors in case of empty or otherwise corrupt input files in rundGdal. If you like I can try to come up with a PR.

fdetsch commented 4 years ago

Since you're probably the only one who can reproduce this at the moment, I'd highly appreciate a PR. There are integrity checks for downloaded .hdf files in the pkg, which should lead to download retries in case of corrupted files. I am not certain what's going wrong in this prospect..