Open-EO / openeo-hub

Source code for openEO Hub, a centralized platform to explore openEO back-end providers.
https://hub.openeo.org
Apache License 2.0

Search for processes / collections broken #76

Closed (m-mohr closed 4 years ago)

m-mohr commented 4 years ago

Today, I had to search for all back-ends implementing sum, count, divide, neq and if and encountered some issues on the Hub:

  1. No processes (or collections) are available in the autocomplete lists
  2. If I manually type "sum", no back-end is found although at least GEE supports it
  3. I tried to look at the processes (and collections) manually per back-end and saw odd behavior: when I expand GEE, it shows "Collections (440)", but when I click to expand the list, it shows "Collections (0)" and the list is empty.

christophfriedrich commented 4 years ago

Sounds like something major is wrong in the database. ~Can you run the drop script on the server and then crawl again and report whether the problem persists?~

I tracked down the error: The collection "COPERNICUS/S2_SR" is listed twice in http://earthengine.openeo.org/v1.0/collections (index 19 and again 49).
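A duplicate like this can be spotted with a few lines of Python. This is only a minimal sketch, not part of the Hub: it assumes the response body has a `collections` array whose entries carry an `id` field, and the helper name `find_duplicate_ids` and the sample data are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_ids(collections):
    """Map each repeated collection ID to the list of indexes where it occurs."""
    positions = defaultdict(list)
    for index, collection in enumerate(collections):
        positions[collection["id"]].append(index)
    # Keep only the IDs that appear more than once.
    return {cid: idxs for cid, idxs in positions.items() if len(idxs) > 1}


# Made-up sample in the assumed shape of a /collections response body.
sample = {"collections": [{"id": "A"}, {"id": "B"}, {"id": "A"}]}
print(find_duplicate_ids(sample["collections"]))  # {'A': [0, 2]}
```

Run against the real response, the same function would report the repeated ID together with both of its indexes.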

After the actual crawling, the data is processed (the crawl script logs "Processing data..."). Among other things, the collections and processes are extracted from the raw data and saved into dedicated database tables*. But these tables have a unique key on service + api_version + ID, and the duplicate entry violates it, so inserting the result of the extraction pipeline into the collection table fails. -> The autocomplete list for collections is empty (because it is derived from this table).

The initial preview of collections is derived from the backends table, but when expanding, the dedicated collections table is used. -> The weird behaviour from point 3 happens (because the former is filled but the latter is empty).

In the post-crawl processing, the processes are handled after the collections, so their pipeline didn't get to run because the collections already failed earlier. -> autocomplete list for processes is empty
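One general way to keep a failure in one step from blocking the next is to run each step in its own try/except. The sketch below only illustrates that idea; it is not the Hub's code, and the step names (`extract_collections`, `extract_processes`) and the `run_steps` helper are hypothetical.

```python
import logging

logger = logging.getLogger("post_crawl")


def run_steps(steps):
    """Run every (name, callable) step; return the names of those that raised."""
    failed = []
    for name, step in steps:
        try:
            step()
        except Exception:
            # Log the traceback and carry on with the remaining steps.
            logger.exception("Post-crawl step %r failed", name)
            failed.append(name)
    return failed


completed = []


def extract_collections():
    # Hypothetical stand-in for the insert that hits the unique key.
    raise ValueError("duplicate key")


def extract_processes():
    # Hypothetical stand-in for the processes pipeline.
    completed.append("processes")


failed = run_steps([
    ("collections", extract_collections),
    ("processes", extract_processes),
])
```

With this structure the processes step still runs after the collections step has failed, and `failed` records which steps need attention.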

I don't understand why the search doesn't work though.

Looks like I have to make this processing more robust...


* Note: The Hub uses MongoDB, so technically it's not a "table" but a "collection", but I find this terminology confusing when also talking about openEO collections in the same context.

christophfriedrich commented 4 years ago

I have now added pipeline stages to the DB queries that ensure collection and process IDs are unique.
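The idea behind such a deduplication step can be sketched as follows. This is an illustration under assumptions, not the actual commit: the field names `service`, `api_version` and `id` are guessed from the unique key described earlier, `dedupe` is a hypothetical helper, and `DEDUPE_STAGES` only shows one possible shape for equivalent MongoDB aggregation stages (`$group` with `$first`, then `$replaceRoot`).

```python
def dedupe(docs, key_fields=("service", "api_version", "id")):
    """Keep the first document for each combination of the key fields."""
    seen = set()
    unique = []
    for doc in docs:
        key = tuple(doc[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


# One possible (assumed) way to express the same thing as aggregation stages.
# Note that $group does not guarantee the order of its output documents.
DEDUPE_STAGES = [
    {"$group": {
        "_id": {"service": "$service", "api_version": "$api_version", "id": "$id"},
        "doc": {"$first": "$$ROOT"},
    }},
    {"$replaceRoot": {"newRoot": "$doc"}},
]

# Made-up documents; the second one repeats the key of the first.
docs = [
    {"service": "gee", "api_version": "1.0", "id": "X", "n": 1},
    {"service": "gee", "api_version": "1.0", "id": "X", "n": 2},
    {"service": "gee", "api_version": "1.0", "id": "Y", "n": 3},
]
unique_docs = dedupe(docs)
```

After this step each key occurs at most once, so an insert into a table with a unique key on those fields no longer fails on duplicates.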

@m-mohr In principle you could deploy the current state of the dev branch, I'm just not yet saying "DO IT!" because of my confusion regarding the vue-components dependency.

Alternatively cherry-pick the above commit for a hotfix?

m-mohr commented 4 years ago

Deployed, seems to work fine.

m-mohr commented 4 years ago

There's also an issue with the Functionalities Chooser: see #77