Closed MagicForrest closed 6 years ago
If I understand correctly, all that is required in order to solve this issue is to leave both the input CRS (ie. MODIS Sinusoidal) and pixel size of MOD44B untouched.
I just modified getTile()
in such a way that, whenever a country name (passed to maps::map()
behind the curtain) is specified as extent
, the target CRS of all layers extracted through runGdal()
is adopted from MODISoptions() instead of the resulting 'map' object (ie. EPSG:4326), see also 0892cc8bce8f475842dd3981dcc3725752652294. Something like
## devtools::install_github("MatMatt/MODIS", ref = "develop")
library(MODIS)
MODISoptions(outProj = "asIn", pixelSize = "asIn") # default settings
tfs = runGdal("MOD44B", tileH = 21, tileV = 8
, begin = "2000001", end = "2000366"
, SDSstring = "1000000", job = "mod44b-test")
## check raster
rst = raster(unlist(tfs))
projection(rst)
# [1] "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"
## count values
tbl = table(rst[])
rev(tbl)[1:5]
# 200 88 87 86 85
# 473186 2 23 111 372
now seems to generate the desired outcome. Note that when working with 'Raster' or 'Spatial' inputs instead of country names, the target CRS will be adopted from the input object. In this case, the object needs to be available in MODIS Sinusoidal ("+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs") in order to suppress reprojection, and hence interpolation issues affecting extracted layers. Here's some sample code for this second approach:
shp = getData(country = "TUR", level = 0, path = tmpDir())
shp = spTransform(shp, CRS("+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"))
tfs = runGdal(product = "MOD44B", extent = shp
, begin = "2015001", end = "2015365"
, SDSstring = "1000000", job = "mod44b-test")
Hmmm... I can't really comment on the first part because I don't use the package in that way (either specifying tiles or country extents). The second point is more relevant to me, but again not entirely, because it doesn't do any include any interpolation which is very important for my workflow.
My basic task is to compare MODIS data to my simulation outputs. So, for this purpose I want to put the MODIS data exactly on simulation grid (as defined as an R raster). Therefore my target grid (and CRS) is fixed. I appreciate that I could probably transform my target grid to MODIS sinusoidal, do the runGdal(), and then transform back. But that is time consuming, will introduce error and is just generally clumsy.
Maybe I am asking for a lot, but what I really need is to be able to interpolate and project the MODIS data on to any arbitrary grid (assuming it can be defined on an R raster). And it needs to handle MODIS 'out-of-range' values sensibly, ie not just blindly include them in whatever interpolation needs to be done. Maybe this is asking too much for the aims of your package?
But it seems to me that such functionality is almost there, it is just that GDAL is blind to MODIS's 'water' and 'fill' values. If they could be masked out when the GDAL command is run (is assume it is a gdalwarp command in the background which is why I suggested a particular argument) then the holy grail would be achieved!
Does this all make sense? It is hard to explain. We can also chat on the phone if you want.
Hello again. I realise I was a bit demanding and under-appreciative of your last response. Sorry about that.
So I thought more carefully about it, and came to the conclusion that your solution would allow me to download and mosaic the data using your MODIS package which is still useful. But then I would need to re-project and and resample it myself (after changing the water and fill values to NA) to get the data I need.
I then tried to run the example code you posted, but I got an "NA" for the raster projection and different values for counts. Perhaps because of this error message:
Warning messages: 1: In .getCRSfromGridMap4(atts) : cannot process these parts of the CRS: long_name=CRS definition; spatial_ref=PROJCS["unnamed",GEOGCS["unnamed ellipse",DATUM["unknown",SPHEROID["unnamed",6371007.181,0]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],PROJECTION["Sinusoidal"],PARAMETER["longitude_of_center",0],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["Meter",1]]; GeoTransform=3285172.47849608 231.6580305865238 0 1111950.51976652 0 -231.6563582846917 2: In .getCRSfromGridMap4(atts) : cannot create a valid CRS grid_mapping_name; false_easting; false_northing; scale_factor_at_projection_origin; scale_factor_at_central_meridian; standard_parallel; standard_parallel1; standard_parallel2; longitude_of_central_meridian; longitude_of_projection_origin; latitude_of_projection_origin; straight_vertical_longitude_from_pole; longitude_of_prime_meridian; semi_major_axis; inverse_flattening; +proj; +x_0; +y_0; +k_0; +k_0; +lat_1; +lat_1; +lat_2; +lon_0; +lon_0; +lat_0; +lon_0; +lon_0; +a; +rf
Merged your pull request with 5533ca57f9348f2c8ee9ea875cad26eb3a63c72e, sorry for the delay. 'maskValue' works like a charm, thanks for that. Until the next CRAN release, this'll be available from 'develop' only. Note that I added your name to the list of contributors in DESCRIPTION and ?runGdal
, hopefully that's okay for you.
I'll leave this thread open until we find a solution for passing multiple mask values on to GDAL.
Multiple No Data Values are now available from the 'develop' branch. @MagicForrest, could you please try out the following:
# devtools::install_github("MatMatt/MODIS", ref = "develop")
library(MODIS)
lai = runGdal("MCD15A2H", tileH = 21, tileV = 8
, begin = "2003001", end = "2003008"
, SDSstring = "010000", job = "mcd15a2h-test"
, maskValue = 249:254, overwrite = TRUE)
rst = raster(lai[[1]][[1]])
(tbl <- table(rst[]))[order(as.numeric(names(tbl)), decreasing = TRUE)][1:10]
# 70 69 68 67 66 65 64 63 62 61
# 655 491 851 4107 11045 5398 3704 3555 13975 456
In MCD15A2H, there is a pre-defined No Data Value (255) that is automatically detected by runGdal()
plus a couple of fill values (249 to 254) that need to be specified by the user. Note that in the above code chunk, we could also define maskValue = 249:255
- the function would automatically detect that 255 is the 'official' No Data Value, and hence remove it from 'maskValue' to reduce computation time.
For the record, there definitely are fill values in the raw image:
lai_ref = runGdal("MCD15A2H", tileH = 21, tileV = 8
, begin = "2003001", end = "2003008"
, SDSstring = "010000", job = "mcd15a2h-test"
, overwrite = TRUE)
plot(rst_ref <- raster(lai_ref[[1]][[1]]))
(tbl_ref <- table(rst_ref[]))[order(as.numeric(names(tbl_ref)), decreasing = TRUE)][1:10]
# 254 253 250 70 69 68 67 66 65 64
# 117370 83443 4426 655 491 851 4107 11045 5398 3704
Thanks for picking up the pull request and extending it to multiple no data values :-)
With GTiff it worked like charm! So I would say that it has been 'linux tested'.
However (and this is probably a different issue) with netCDF I got errors about lzw compression and also about the sinusoidal projection. But I still get this issue with the master branch, so it must be unrelated. I'm not sure if it is from my GDAL/netCDF/HDF/(zlib?) software stack, or actually with the MODIS package. I'll look into it some day.
Thanks again! Also, you don't need to list me as a contributor since my contribution has been so minor. I wouldn't want to 'dilute' the credit due to more worthy contributors.
Any updates on the netCDF part? I'll probably test it on my machine later this day to see if this is somehow related to MODIS.
Closing this issue since multiple mask values are now available. Moved the netCDF discussion to a new issue #52.
Thanks again for the great package, I really like it. But unfortunately I have found an issue that might really hinder its usefulness, at least for me. Of course I might have missed something....
When using GDAL on product MOD44B (and this might well affect other products) the codes for water (200) and fill (253) seem to be included in the interpolation as if they were real data. This means that some larger aggregated pixels have values of greater that 100% for tree cover (or bare cover or whatever). This is not ideal, but they can be trimmed out relatively easily (at the loss of some pixels from the aggregated data) . You can see this in the attached screen shot. Note the scale in ncview, all the pixels that are not blue are over 100. This domain in question is Turkey by the way. You can probably just about make out the coastline.
More concerning is the fact that this indicates that some pixels may have apparently valid values (in the range 0-100) but actually they include contributions of 200 or 253. This is impossible to identify.
I have tried with both "average" and "bilinear" interpolation and I see that same thing.
Obviously this is kind of a problem. As far as I can tell it needs to be solved in the MODIS package by masking out the water and fill pixels at the GDAL command stage. This should be possible using an option, I think. Maybe the -srcnodata option perhaps.