CatalogueOfLife / checklistbank

UI for checklistbank.org
https://www.checklistbank.org/

Spoiled metadata in fixed releases #1265

Closed yroskov closed 11 months ago

yroskov commented 1 year ago

Some items(*) of GSD metadata in fixed CoL editions change to the latest state of the metadata in the CoL Working Draft (id 3). That is wrong.

(*) At least the Alias is incorrect.

yroskov commented 1 year ago

Example from the 2023 Annual Checklist:

AC23 was published with the 3i Curculio GSD (id 9910). The authors of 3i Curculio changed its Short Name to Entiminae in July, a month after the AC23 release. But in the list of AC23 sources (=COL23, id 9910), this GSD appears under the new name "Entiminae" (whereas it should be "3i Curculio"):

https://www.checklistbank.org/dataset/9910/sourcemetrics

image

The GSD page in AC23 shows the correct Short Name and logo, "3i Curculio": https://www.checklistbank.org/dataset/9910/source/1166

image

yroskov commented 1 year ago

Example from the August CoL release:

The same GSD, Entiminae (formerly 3i Curculio, id 9910). The authors changed its logo after the August release, but the new logo appears in the August release:

image

Indeed, whatever logo I upload to the CoL Working Draft in CLB immediately appears in the fixed(!) edition on the portal.

For example, a fake logo uploaded to CLB: image

The logo immediately changed in the August edition on the portal: image

yroskov commented 1 year ago

So Alias and Logo (and what else from the metadata?) are fed dynamically from the CoL Working Draft in CLB. Monthly and annual editions, which are supposed to be fixed, may therefore show metadata incorrectly taken from the CoL Working Draft.

mdoering commented 1 year ago

You are right about the logo: it was never archived, and we keep just a single one per dataset. I have created a new issue to implement a logo archive: https://github.com/CatalogueOfLife/backend/issues/1245

mdoering commented 1 year ago

The metrics page uses this call to retrieve all datasets in one go: https://api.checklistbank.org/dataset?limit=1000&contributesTo=9910&sortBy=alias

This does not query the archive, but takes the live dataset. @thomasstjerne, can we change that to use this resource instead? https://api.checklistbank.org/dataset/9910/source
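To make the difference between the two calls concrete, here is a minimal sketch that only builds the two URLs quoted above for a given release key. The function names are hypothetical and are not part of the UI code. No request is sent and nothing is assumed about the shape of the responses.

```python
from urllib.parse import urlencode

# Host taken from the URLs quoted in this thread.
API = "https://api.checklistbank.org"


def live_sources_url(release_key: int, limit: int = 1000) -> str:
    """Dataset search call currently used by the metrics page (live metadata)."""
    query = urlencode(
        {"limit": limit, "contributesTo": release_key, "sortBy": "alias"}
    )
    return f"{API}/dataset?{query}"


def archived_sources_url(release_key: int) -> str:
    """Source listing of a release, the resource proposed as the replacement."""
    return f"{API}/dataset/{release_key}/source"


if __name__ == "__main__":
    print(live_sources_url(9910))
    print(archived_sources_url(9910))
```

With release key 9910, the first function reproduces the search URL from the comment above, and the second the `/dataset/9910/source` resource.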

mdoering commented 1 year ago

@thomasstjerne logos are archived for each release. Does the UI access them via the release datasetKey like this?

/dataset/{releaseKey}/logo/source/{sourceDatasetKey}
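As an illustration of the path template above, a small sketch that fills in the two keys. The helper name is hypothetical. The sample keys (release 9910, source 1166) are taken from the AC23 source page URL earlier in this thread. Which host serves this path is not stated here, so only the path is built.

```python
def release_logo_path(release_key: int, source_dataset_key: int) -> str:
    """Fill the /dataset/{releaseKey}/logo/source/{sourceDatasetKey} template."""
    return f"/dataset/{release_key}/logo/source/{source_dataset_key}"


if __name__ == "__main__":
    # Sample keys from this thread: release 9910, source 1166.
    print(release_logo_path(9910, 1166))
```

Passing the release key rather than the Working Draft key (3) is the point of the question: the logo requested this way is tied to the release, not to the live project.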