CatalogueOfLife / xcol

Working towards the extended Catalogue of Life Checklist

Assign an alias to all extended sources #50

Open mdoering opened 1 year ago

mdoering commented 1 year ago

All sources used in the xrelease should have an alias which is better than just the id shown in places like the source metrics: http://coltest-vh.catalogueoflife.org/dataset/9913/sourcemetrics

DianRHR commented 1 year ago

The source metrics page on coltest doesn't display the list of sources.

I got some of the sources from a download and found these aliases on the tree/browse page. Where the alias was not clear, I looked directly into the name records:

sourceID | Alias in CLB | Extended source name
-- | -- | --
2037 | Zoobank | Zoobank
1174 | PaleoBioDB | Paleobiology Database
2006 | IPNI | IPNI
127379 | AFD | Australian Faunal Directory
2047 | 2047 | Danish Mycological Society - Checklist of Fungi:
6282 | Myriatrix | Myriatrix \| Scratchpads
2262 | LCVP | The Leipzig catalogue of vascular plants
37384 | BOLD | BOLD

But I didn't find a way to check all the sources. What about you @camiplata ?
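
For whoever ends up with a full export, here is a minimal sketch of how sources without a usable alias could be spotted. It assumes the rows are available as `(sourceID, alias, name)` tuples. The helper name `needs_alias` is made up for illustration and is not part of ChecklistBank.

```python
# Rows copied from the sources listed above: (sourceID, alias in CLB, extended source name).
SOURCES = [
    (2037, "Zoobank", "Zoobank"),
    (1174, "PaleoBioDB", "Paleobiology Database"),
    (2006, "IPNI", "IPNI"),
    (127379, "AFD", "Australian Faunal Directory"),
    (2047, "2047", "Danish Mycological Society - Checklist of Fungi:"),
    (6282, "Myriatrix", "Myriatrix | Scratchpads"),
    (2262, "LCVP", "The Leipzig catalogue of vascular plants"),
    (37384, "BOLD", "BOLD"),
]


def needs_alias(source_id, alias):
    """Return True when the alias is empty or is only the numeric dataset id."""
    alias = (alias or "").strip()
    return alias == "" or alias == str(source_id)


# Hypothetical helper applied to the rows above.
missing = [source_id for source_id, alias, _ in SOURCES if needs_alias(source_id, alias)]
print(missing)  # [2047]
```

In this sample only 2047 is flagged, because its alias is just the id.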

camiplata commented 1 year ago

Here is a doc with the aliases of the 44 current sources that already had one, and a suggested alias for each of the 61 sources without one:

https://docs.google.com/spreadsheets/d/1CU3rR_RC848YnUoe7JpaPfrfRSI3b3-OkVYiAXESjwQ/edit?usp=sharing

I didn't include information for organizations. Also note that some of the dataset sources may be replaced by an organization source, like the German datasets, so probably not all aliases will need to be used.

mdoering commented 1 year ago

Probably the best place to work from is http://coltest-vh.catalogueoflife.org/dataset/9913/sourcemetrics

camiplata commented 1 year ago

Use shorter aliases, and add them using the PATCH function for the sources under a project, for now only on the test environment.
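
As a rough illustration of what such a patch request might look like, here is a sketch that only builds the request and does not send it. The endpoint URL, the `alias` field name in the JSON body, and the example alias are all assumptions for illustration. Check the actual ChecklistBank API before using anything like this, and add whatever authentication it requires.

```python
import json
import urllib.request


def build_alias_patch(url: str, alias: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets a source alias.

    Both the URL and the "alias" field name are assumptions, not confirmed API details.
    """
    body = json.dumps({"alias": alias}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder URL and made-up alias, for illustration only.
req = build_alias_patch("https://example.org/placeholder/source/2047", "DanFungi")
print(req.get_method(), req.data.decode("utf-8"))
```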

DianRHR commented 1 year ago

The first sheet of this Google Sheet lists the sources of the GBIF Backbone. The second sheet lists the sources of the COL draft project (obtained from the COL project source metrics). This makes it clear which sources already have an alias assigned.
We can also add more sources to this list before adding them to the COL draft project.

mdoering commented 1 year ago

Thanks for the list. When you spot changes that need to be made, please keep the COL project on coltest as the main source of truth at this stage. Once we are pleased with the first xcol results we can move sectors and source dataset patches (for the aliases) to prod. The sheet should only help you to improve coltest and not become a reference point for the future.

mdoering commented 3 months ago

can you review all sources in prod?

mdoering commented 3 months ago

also add a logo if possible

mdoering commented 1 month ago

We should harmonize all aliases to follow the same naming conventions. Right now we have spaces, underscores and hyphens in use.

Let's just use spaces and Camel Case, not all upper.
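
A minimal sketch of how the separator part of this could be automated, assuming aliases are plain strings. Both function names and the example aliases are made up for illustration. Casing is only flagged for manual review, since single-word acronyms such as IPNI are presumably fine as they are, and some hyphens may be a legitimate part of a name.

```python
import re


def harmonize_alias(alias: str) -> str:
    """Replace underscores and hyphens with spaces and collapse repeated whitespace."""
    return re.sub(r"[\s_\-]+", " ", alias).strip()


def is_all_upper_phrase(alias: str) -> bool:
    """Flag multi-word aliases written entirely in upper case for manual review."""
    return len(alias.split()) > 1 and alias.isupper()


# Made-up aliases, not taken from the real source list.
for raw in ["My_Source-Alias", "MY_SOURCE", "IPNI"]:
    cleaned = harmonize_alias(raw)
    print(raw, "->", cleaned, "(review casing)" if is_all_upper_phrase(cleaned) else "")
```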