To answer those questions we might need some help from the service providers, as the client software is merely a tool to interact with the back-ends.
Regarding your first question: I'm not sure how the GEE driver serializes the data at the end. My guess is that it can only serialize 2D data, where the dimensions are the spatial ones. @m-mohr might have some more insights here, as he probably knows best how the GEE data cube is serialized in the end. From a client perspective, the only workaround would really be the loop approach you suggested.
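A rough sketch of that loop approach with the R client (collection id, bounding box, band names and dates below are placeholders, so adjust them to your use case):
# Sketch only: request one result per time step instead of a single multi-layer file.
library(openeo)
con = connect("https://earthengine.openeo.org")   # assumed GEE back-end URL
p = processes()

bbox  = list(west = 6.0, south = 50.0, east = 6.5, north = 50.5)   # placeholder extent
dates = c("2018-06-01", "2018-06-11", "2018-06-21", "2018-07-01", "2018-07-11")

for (i in seq_len(length(dates) - 1)) {
  dat = p$load_collection(
    id = "COPERNICUS/S2",                    # placeholder collection id
    spatial_extent = bbox,
    temporal_extent = c(dates[i], dates[i + 1]),
    bands = c("B8", "B12")
  )
  ndwi = p$reduce_dimension(
    data = dat, dimension = "bands",
    reducer = function(x, context) p$normalized_difference(x[1], x[2])   # B8, B12
  )
  out = p$save_result(data = ndwi, format = "GTIFF-THUMB")
  compute_result(graph = out, output_file = sprintf("ndwi_%02d.tif", i))
}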
About the second question: this is probably also up to @m-mohr and @jdries to discuss and elaborate on. In general, you can up- or downscale the bands in order to make them compatible for further analysis, meaning you need them to have the same resolution. If GEE shows a higher resolution, I would assume that the lower-resolution band is resampled via a nearest neighbor approach. Since the bands of S2 are aligned, you would basically split one pixel of band 12 into four pixels and assign each of them the value of the source pixel. But this is just a guess; I'm not exactly sure how Google Earth Engine handles those conflicts internally, nor how VITO handles them.
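As a toy illustration of that splitting (values and grid are made up; this only mimics what a nearest-neighbor upsampling would do):
# Toy example: a 2 x 2 grid of 20 m pixels is disaggregated into 10 m pixels;
# each source pixel simply passes its value on to a 2 x 2 block of output pixels.
library(raster)
b12_20m = raster(nrows = 2, ncols = 2, xmn = 0, xmx = 40, ymn = 0, ymx = 40,
                 vals = c(0.1, 0.2, 0.3, 0.4))
b12_10m = disaggregate(b12_20m, fact = 2)
res(b12_10m)
# 10 10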
Some tips from Terrascope side:
The 'SAR processing' example in this notebook: https://github.com/Open-EO/openeo-usecases/blob/master/final_user_workshop/openeo_finaluserworkshop_livedemo.ipynb shows my attempt at running something on Terrascope and GEE.
This is how to change projection and resolution in GEE output:
sar_difference(gee, collection_gee, to_decibel=False).download("sar-sigma0-rgb-scaled-gee.tiff", format='GTIFF-THUMB', options={'size': 2500, 'epsgCode': 32631})
Thanks both of you for sharing your thoughts on this.
@jdries I wasn't aware that omitting bands when loading the collection caused such an overhead on the back-end side, even if subsequent calculations use only two bands from that collection – thanks for the hint.
In fact, it seems like GEE returns a higher spatial resolution than VITO by default (not sure how they achieve that, though; I would assume bilinear interpolation, nearest neighbor or the like). At least, when I reproject the GEE result layer from lat/lon to EPSG:32631, I get:
library(raster)
prj2.2 = projectRaster(rst2.2, crs = "+init=epsg:32631")  # reproject the downloaded GEE layer to UTM zone 31N
res(prj2.2)
# 3.43 5.56 (x, y)
Sentinel-2 layers from VITO have the regular 10 m spatial resolution.
Any confirmation on the 2D serialization that @flahn mentioned would still be highly appreciated.
Interesting topic! Here are some thoughts from my side as developer of the GEE back-end (although I have no deeper insights into GEE) and the person who was mainly working on the process specifications.
First of all, the way you are using apply is not specification-compliant. I'm not sure why this works with VITO (@jdries), but what you should use here is reduce_dimension, as you already mentioned (see the sketch below). apply works on a per-pixel basis and you can't access labels there as you do (e.g. x["B8"]).
spectral_reduce2.1 = p2$apply(dat2, function(x, context) { p2$normalized_difference(x["B8"], x["B12"]) })
The error message GEE gives:
SERVER-ERROR: Server error: Dimension 'undefined' does not exist.
is not ideal and should be improved nevertheless.
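For reference, a specification-compliant variant of the snippet above might look roughly like this in the R client (a sketch; positional indices refer to the band order requested in load_collection, and band labels such as x["B8"] may or may not work depending on the back-end metadata):
# Sketch: reduce the "bands" dimension instead of using apply().
spectral_reduce2.1 = p2$reduce_dimension(
  data = dat2,
  dimension = "bands",
  reducer = function(x, context) {
    p2$normalized_difference(x[1], x[2])   # B8, B12
  }
)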
Regarding the questions, I think @flahn is already on the right track.
Currently, the GEE back-end basically creates a mosaic of the remaining data to get a single image (2D with bands), as we can only call the getDownloadUrl/getThumbUrl GEE API calls on single images. Requesting the logs should show that as a warning: "Compositing the image collection to a single image.". We'd likely get better results if we directly ran batch jobs on GEE, but that is much more complex and we never had time to implement it. The GEE implementation is only a proof of concept and can't be "completed" with the limited funding we have.
I'm also not 100% sure how GEE works internally, but with the methods I have at hand for downloading files, I don't think I can influence that (reliably).
These are all things I would like to improve, but that would need support from Google and quite a lot of additional time (i.e. funding).
@m-mohr Thanks for sharing these insights. Can you elaborate a bit on the compositing procedure, i.e. which function is applied (mean, max, etc.) to create the composite? Or is it simply the first/last image in the collection stack?
@fdetsch I'm using GEE's mosaic function for Image Collections: https://developers.google.com/earth-engine/apidocs/ee-imagecollection-mosaic As the documentation is rather short, I also have no real clue what GEE is doing... I've seen it being used in examples and used it, too, but maybe there's a better way of handling this case. I'm always open to suggestions.
I just found some information in a GEE guide:
The mosaic() method composites overlapping images according to their order in the collection (last on top). To control the source of pixels in a mosaic (or a composite), use image masks.
I'm not using a mask, so it's the last image. Maybe I should use qualityMosaic() instead of mosaic() for (better?) results? Long term there should be no mosaicing, but as I mentioned above... funding and so on.
@jdries (or any of the others) Any idea how to adjust GEE output CRS and resolution in R? I tried to specify options via ... in compute_result() based on your Python code, but had no success so far.
I think @flahn can better answer R related questions... :-)
@fdetsch I'd expect that you can set it in save_result, something like:
r1 = p1$save_result(
  data = spectral_reduce1,
  format = "GTIFF-THUMB",
  options = list(size = 2500, epsgCode = 32631)
)
I'm not an R programmer, so the syntax might differ slightly...
That was helpful! Image reprojection via epsgCode works like a charm (syntax seems to be correct ;-)). Unfortunately, size determines the dimensions, i.e. the number of rows and columns, of the resulting image. Any idea if there is a parameter that controls image resolution, i.e. pixel size? I tried res and resolution, but had no success so far. GEE's default resolution when projecting to UTM seems to be 56.11531, 56.11531 (x, y), i.e. much coarser than the expected 10 and 20 m for bands 8 and 12, respectively. This likely confirms the trade-off between file size and performance at GEE mentioned by @jdries in the beginning.
Do you know if such customization options accepted by save_result(options = list(...)) (for any supported client) are documented somewhere?
So from what I have found out so far, scale seems to control pixel size on the back-end side. Thing is, while epsgCode is taken into account, scale seems to be ignored both in standalone mode (i.e. being the only option) and in combination with epsgCode 🤔
@fdetsch list_file_formats() gives you the options you can pass for the file formats to save_result, see an example here.
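In the R client that would look roughly like this (a sketch; the exact structure of the returned object may differ):
library(openeo)
con = connect("https://earthengine.openeo.org")   # assumed GEE back-end URL
formats = list_file_formats()
formats   # lists the supported input/output formats and the parameters accepted via save_result(options = ...)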
The size option is basically just passed through to https://developers.google.com/earth-engine/apidocs/ee-image-getdownloadurl and https://developers.google.com/earth-engine/apidocs/ee-image-getthumburl, depending on the file format. As you said, size corresponds to the dimensions in GEE.
We have not implemented scale yet. It only works for GTIFF-ZIP and also only works in GEE if crs is not set. We can implement it though, if there's demand for it.
Got it, thanks.
Just noticed that I was seemingly using the wrong UTM zone for reprojection, although it is convention to use this zone for the area I am working with. Using the neighboring zone, I am down to a resolution of 1.927902, 1.927902 (x, y) meters, so I'm not losing any information. Resampling to 10 m needs to happen in R, then. I would definitely stand in line for a GEE-based implementation, if only it weren't for the "CRS not set" limitation. Setting the target resolution in lat/lon native units seems a little cumbersome.
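For the resampling step in R, something along these lines should do (a sketch based on the raster objects used earlier in this thread; prj2.2 stands in for the reprojected GEE layer, and bilinear resampling is just one possible choice):
library(raster)
template = raster(extent(prj2.2), resolution = 10, crs = crs(prj2.2))  # 10 m target grid
rst_10m  = resample(prj2.2, template, method = "bilinear")
res(rst_10m)
# 10 10 (x, y)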
@m-mohr I consider this issue closed from my point of view. I'll leave it up to you to close it or not since you included a couple of things above that you are envisioning for the future.
I've now created issues in the GEE repository to track all remaining improvements. So closing this for now.
I am trying to retrieve raster time series of NDWI for a given area of interest and time period. My code runs smoothly when using the VITO back-end:
Unfortunately, my area of interest lies in the United States, for which VITO doesn't seem to provide Sentinel-2 data. At least, I get
from compute_result() when specifying a bounding box located in the US. Therefore, I modified the above code to use Google's Earth Engine instead of VITO. Here, the apply() based approach to get raster time series for the same target extent as above no longer works:
Using a reduce_dimension() based approach instead, the program finishes successfully, but the resulting GeoTIFF doesn't have the expected four layers (i.e. four overflights) and, in addition, has a different spatial resolution:
My question here is two-fold: