Open-EO / openeo-earthengine-driver

openEO back-end driver for Google Earth Engine.
Apache License 2.0

List of data sets #1

Closed m-mohr closed 5 years ago

m-mohr commented 6 years ago

Currently, the list of data sets is hard-coded with just one entry (Copernicus/S2), because I don't know of a way to get a list of data sets that I could parse.

Ideally, the list would include information such as the bands, name, id, description, source/copyright, etc. I don't really want to parse it from the HTML pages (https://explorer.earthengine.google.com), as that is pretty messy. I found an endpoint, https://code.earthengine.google.com/rasters?q=keyword, that is used by the GEE Code Editor Playground, but I was not able to get responses from outside the Playground. Once we have a list, I assume we could get the details using ee.ImageCollection.getInfo().

I asked for help at gis.stackexchange.com and contacted Noel Gorelick.

m-mohr commented 6 years ago

According to Noel, there is no "stand-alone endpoint for dataset discovery". It's on their to-do list and might become available during the project lifetime. Google also offered to dump the data into their Cloud Storage.

For now, I use a dump of their data sets as the basis for GET /data and get some additional data from ee.ImageCollection.getInfo() / ee.Image.getInfo() for GET /data/{id}. Implemented with https://github.com/Open-EO/openeo-earthengine-driver/commit/548e3da4aac36b5698f8e0ab3d26efd6a1ce15cd
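
The split described in this comment (a static dump for the listing, a live lookup for the details) could look roughly like the following Python sketch. The dump layout, the `fetch_info` callable standing in for a call such as `ee.ImageCollection(...).getInfo()`, and all function names are assumptions made for illustration; the actual implementation is the linked commit.

```python
from __future__ import annotations

from typing import Callable


def list_datasets(dump: list[dict]) -> list[dict]:
    """GET /data: return only the cheap summary fields from the static dump."""
    return [
        {"id": entry["id"], "description": entry.get("description", "")}
        for entry in dump
    ]


def describe_dataset(
    dump: list[dict],
    dataset_id: str,
    fetch_info: Callable[[str], dict],
) -> dict | None:
    """GET /data/{id}: combine the dump entry with live metadata.

    `fetch_info` is a hypothetical stand-in for the Earth Engine lookup;
    it is injected so the sketch runs without credentials.
    """
    entry = next((e for e in dump if e["id"] == dataset_id), None)
    if entry is None:
        return None  # unknown data set; a server would answer 404 here
    live = fetch_info(dataset_id)
    return {**entry, "properties": live.get("properties", {})}


# Hypothetical dump with a single entry, plus a fake lookup for the demo.
sample_dump = [{"id": "COPERNICUS/S2", "description": "Sentinel-2 imagery"}]


def fake_fetch_info(dataset_id: str) -> dict:
    return {"id": dataset_id, "properties": {"source": "demo value"}}


listing = list_datasets(sample_dump)
detail = describe_dataset(sample_dump, "COPERNICUS/S2", fake_fetch_info)
missing = describe_dataset(sample_dump, "UNKNOWN/ID", fake_fetch_info)
```

Only the detail request triggers the lookup, so the listing stays fast even if the metadata call is slow.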

m-mohr commented 6 years ago

Participants at the hackathon got confused by the GEE back-end because some data is guessed from a single image (band information: HH and HV were available, but VV was missing). We need to provide proper information about data sets, but that's something Google needs to deliver. Maybe it is better not to guess data at all and to leave it out completely.
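
To illustrate the band problem described in this comment, here is a minimal Python sketch. It assumes image metadata shaped like the dictionary returned by `ee.Image.getInfo()` (a `bands` list whose entries carry an `id`); the sample records and function names are hypothetical and are not the driver's code.

```python
from __future__ import annotations

from typing import Iterable


def bands_from_first_image(images: list[dict]) -> list[str]:
    """Guess the band list from the first image only (the fragile approach)."""
    if not images:
        return []
    return [band["id"] for band in images[0].get("bands", [])]


def bands_from_all_images(images: Iterable[dict]) -> list[str]:
    """Collect the union of band ids over every image, in first-seen order."""
    seen: dict[str, None] = {}
    for image in images:
        for band in image.get("bands", []):
            seen.setdefault(band["id"], None)
    return list(seen)


# Hypothetical SAR-like collection: polarisations differ between acquisitions.
sample_images = [
    {"id": "img_1", "bands": [{"id": "HH"}, {"id": "HV"}]},
    {"id": "img_2", "bands": [{"id": "VV"}, {"id": "VH"}]},
]

guessed = bands_from_first_image(sample_images)   # misses VV and VH
complete = bands_from_all_images(sample_images)   # sees every polarisation
```

Scanning every image is likely too expensive for a large collection, which is one reason authoritative machine-readable metadata from the provider is preferable to either approach.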

tylere commented 6 years ago

Each of the datasets in the Earth Engine data catalog has a data description page that gives a short overview. Is that sufficient? https://earthengine.google.com/datasets/

m-mohr commented 6 years ago

Thanks, @tylere. Unfortunately, that is not very handy for machine-to-machine communication. I actually already have a JSON file from Noel containing all your data sets, but I can't reliably extract the band information from the descriptions without too much work. Of course, as a user, I could get the information from there, but it would be better to have it in a machine-readable form so that our clients can handle this information directly and present it nicely to the user. Until Google has some machine-readable dataset information (I heard you are already working on it), I may just remove the band information altogether and add a link to the dataset information page, so that users can look up the information manually.

tylere commented 6 years ago

Makes sense. Yes, we are working on a machine-readable version.

m-mohr commented 5 years ago

Data is now downloaded once a day from a GCS bucket that holds STAC-compatible data. Closing this issue.
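
A once-a-day refresh like the one mentioned in this comment can be sketched in Python as a small cache that re-downloads only when its copy is older than 24 hours. Everything here is an assumption for illustration: the payload layout (a JSON object with a `collections` list whose entries have an `id`), the class and function names, and the injected `fetch` and `clock` callables that stand in for the real bucket download and system time.

```python
from __future__ import annotations

import json
import time
from typing import Callable

ONE_DAY_SECONDS = 24 * 60 * 60


class DailyCatalogCache:
    """Keep a parsed catalog in memory and re-download it at most once a day."""

    def __init__(
        self,
        fetch: Callable[[], str],
        clock: Callable[[], float] = time.time,
        max_age: float = ONE_DAY_SECONDS,
    ) -> None:
        self._fetch = fetch      # hypothetical downloader returning JSON text
        self._clock = clock      # injectable so the demo can fake time
        self._max_age = max_age
        self._collections: dict[str, dict] = {}
        self._loaded_at: float | None = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            catalog = json.loads(self._fetch())
            self._collections = {c["id"]: c for c in catalog["collections"]}
            self._loaded_at = now

    def list_ids(self) -> list[str]:
        self._refresh_if_stale()
        return sorted(self._collections)

    def get(self, collection_id: str) -> dict | None:
        self._refresh_if_stale()
        return self._collections.get(collection_id)


# Demo with a fake download and a fake clock (no network access needed).
fake_now = [0.0]
downloads: list[float] = []


def fake_fetch() -> str:
    downloads.append(fake_now[0])
    return json.dumps(
        {"collections": [{"id": "COPERNICUS/S2", "title": "Sentinel-2"}]}
    )


cache = DailyCatalogCache(fake_fetch, clock=lambda: fake_now[0])
cache.list_ids()                    # first call triggers a download
cache.get("COPERNICUS/S2")          # served from memory
fake_now[0] = ONE_DAY_SECONDS + 1   # pretend a day has passed
cache.list_ids()                    # copy is stale, so it downloads again
```

Refreshing lazily on access avoids a background scheduler; a timer-driven refresh would work equally well if requests must never wait for a download.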