gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

ingestion management - reword "cause" message in GitHub issue creation to be more interpretable #979

Open jhnwllr opened 11 months ago

jhnwllr commented 11 months ago

@marcos-lg @muttcg

Cause: GBIF ID problems exceed 50% threshold: 100% duplicates; 7020 total records; 7020 absent records

The current cause statement for ingestion management issues is hard to interpret and should be rewritten.

It would be nice to have clearer language, so that we can eventually send these issues directly to publishers.

A suggestion (open for discussion):

Total occ records in current version on GBIF : 1000
Total occ records in new publisher version : 1500
Number of new occurrenceIds between versions : 1000
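
To make the suggestion concrete, here is a minimal sketch of how such a message could be rendered. It is only an illustration in Python, not the pipelines code: the function name `format_cause` and its parameter names are hypothetical, and it assumes the three counts are already available as integers.

```python
def format_cause(current_total: int, new_total: int, new_ids: int) -> str:
    """Render the three counts from the suggestion as a plain-text summary.

    All names here are hypothetical; this is not an existing pipelines API.
    """
    lines = [
        f"Total occ records in current version on GBIF : {current_total}",
        f"Total occ records in new publisher version : {new_total}",
        f"Number of new occurrenceIds between versions : {new_ids}",
    ]
    return "\n".join(lines)


# Example values taken from the suggestion above.
print(format_cause(1000, 1500, 1000))
```

One line per count, each with a plain-language label, keeps the message readable for a publisher who does not know the internal threshold terminology.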

ahahn-gbif commented 11 months ago

Thinking out loud: it would be great if this could also apply to the hover-over metrics in the crawling steps report (registry), to make them more easily understandable. Currently: [screenshot, 2023-11-02 09:59:11]