gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Ingestion of a dataset failed at VERBATIM_TO_INTERPRETED #5373

Closed (matdillen closed 4 months ago)

matdillen commented 4 months ago

I published this dataset on June 10. It was picked up for ingestion the same day and everything seemed to be going smoothly.

Today I was alerted that the data still had not shown up on the dataset page. Looking into the history in the registry, the process stopped at the VERBATIM_TO_INTERPRETED step. All later attempts to recrawl the dataset were aborted because nothing had changed on the IPT side.
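For readers who want to inspect the same history programmatically, here is a minimal sketch. It assumes the public GBIF API exposes the pipelines history under `/v1/pipelines/history/{datasetKey}/{attempt}` and that the response nests `executions` → `steps`, each step carrying a `type` and a `state`. Those field names and the helper names `fetch_attempt` and `unfinished_steps` are assumptions for illustration, not something confirmed in this thread.

```python
import json
from urllib.request import urlopen

API = "https://api.gbif.org/v1"  # public GBIF API base URL


def fetch_attempt(dataset_key, attempt):
    """Download the pipelines history for one ingestion attempt.

    The endpoint path is an assumption; check the registry API docs.
    """
    url = f"{API}/pipelines/history/{dataset_key}/{attempt}"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def unfinished_steps(process):
    """Return (type, state) for every step whose state is not COMPLETED.

    Assumes the shape {"executions": [{"steps": [{"type": ..., "state": ...}]}]}.
    """
    found = []
    for execution in process.get("executions", []):
        for step in execution.get("steps", []):
            if step.get("state") != "COMPLETED":
                found.append((step.get("type"), step.get("state")))
    return found
```

Calling `unfinished_steps(fetch_attempt(dataset_key, 1))` would then list any step, such as VERBATIM_TO_INTERPRETED, that did not complete in the first attempt, provided the response has the assumed shape.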

Today, I published a new version (with no changes) and forced a new ingestion. This ingestion went through with no problems and the data is now available on GBIF.

Any idea what the problem was on June 10, or possibly even with the dataset published at that time? I'm pretty sure there should not be any changes between the version on June 10 and the one I just published (but I don't know how I could verify this now).
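One way to check whether two published versions differ, assuming both Darwin Core Archive zip files are still available as local files, is to hash each file inside the archives and compare the results. The sketch below is only an illustration; the function names `member_digests` and `diff_archives` are hypothetical.

```python
import hashlib
import zipfile


def member_digests(path):
    """Map each file name inside a zip archive to the SHA-256 of its content."""
    with zipfile.ZipFile(path) as zf:
        return {
            info.filename: hashlib.sha256(zf.read(info.filename)).hexdigest()
            for info in zf.infolist()
            if not info.is_dir()
        }


def diff_archives(old_path, new_path):
    """Report files that were added, removed, or changed between two archives."""
    old, new = member_digests(old_path), member_digests(new_path)
    return {
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
        "changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }
```

The metadata file (`eml.xml`) will likely show up as changed even when the data is identical, since it usually records version details, so the data files are the ones worth looking at.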

CecSve commented 4 months ago

When I look at the error logs for that ingestion attempt (dataset key 747d8c01-0537-4979-ada7-dad9bfa0d59d, attempt 1, filtered to level ERROR), I can see the following error messages:

It seems to me that the IDs were missing in that attempt, but maybe you can explain @muttcg?

CecSve commented 4 months ago

Hi @matdillen - it seems it was a bug in our system, so it had nothing to do with the dataset.