Open gdower opened 2 months ago
Yes, the local namespace isn't great. It happens during import of ZooBank though, not in the XRelease which just copies them over. Here is the source record: https://api.checklistbank.org/dataset/2037/taxon/d8167df7-42af-4dd2-bc49-5aae7f11a500
Which is based on this verbatim record: https://www.checklistbank.org/dataset/2037/verbatim/242319
For the XRelease, I think it makes sense to remove local identifiers and to make sure that nomenclator identifiers are added with their proper namespace.
Unless ZooBank shares a different dwca, I don't know how to improve this. The scientificNameID is taken as an alternative name ID, but it has no scope. I could maybe block it from the alt IDs when it is exactly the same as the name's own identifier, as is the case here.
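A minimal Python sketch of that blocking rule, for illustration only and not the actual ChecklistBank importer code. The helper name `filter_alt_ids` and the case-insensitive comparison are assumptions:

```python
def filter_alt_ids(primary_id, alt_ids):
    """Drop alternative IDs that merely repeat the record's own ID.

    Hypothetical helper: the comparison ignores case and surrounding
    whitespace, which is an assumption, not documented importer behavior.
    """
    key = primary_id.strip().lower()
    return [alt for alt in alt_ids if alt.strip().lower() != key]


kept = filter_alt_ids(
    "d8167df7-42af-4dd2-bc49-5aae7f11a500",
    [
        "D8167DF7-42AF-4DD2-BC49-5AAE7F11A500",
        "urn:lsid:zoobank.org:act:D8167DF7-42AF-4DD2-BC49-5AAE7F11A500",
    ],
)
print(kept)  # only the namespaced LSID is kept
```

With such a rule, the unscoped scientificNameID from the ZooBank record would no longer end up as a local alternative ID, while a properly namespaced identifier would pass through.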
So the dwca:ID needs their urn:lsid:zoobank.org:act: namespace added onto it or else it gets namespaced as local: by the clb importer? Or should they be putting the namespace on all of their IDs like WoRMS?
https://api.checklistbank.org/dataset/2037/verbatim?q=d8167df7-42af-4dd2-bc49-5aae7f11a500
Is that what the "identifier without scope" issue means?
Perhaps having a dataset configuration option for ID namespace would be useful (like the dataset option for adding extinct to all values?). Then for alternativeID in other datasets, we'd always need to put the namespace, especially if it's not the dataset's namespace.
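A rough Python sketch of what such a per-dataset namespace option could do, assuming the configured namespace is simply prepended to any ID that does not already carry it. The function `with_namespace` and the constant `ZOOBANK_ACT_NS` are hypothetical names, not an existing ChecklistBank setting:

```python
# Namespace taken from the ZooBank LSID quoted in this thread.
ZOOBANK_ACT_NS = "urn:lsid:zoobank.org:act:"


def with_namespace(identifier, namespace):
    """Prefix identifier with namespace unless it already starts with it.

    Hypothetical helper; the case of the identifier is left untouched.
    """
    ident = identifier.strip()
    if ident.lower().startswith(namespace.lower()):
        return ident
    return namespace + ident


scoped = with_namespace("d8167df7-42af-4dd2-bc49-5aae7f11a500", ZOOBANK_ACT_NS)
print(scoped)  # urn:lsid:zoobank.org:act:d8167df7-42af-4dd2-bc49-5aae7f11a500
```

The sketch does not change letter case; the ZooBank LSID quoted in this issue uses an uppercase UUID, so a real option might also need a normalization rule.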
Yes, "identifier without scope" means that it is just a local ID. dwca:ID is not the source of the problem though; it is dwc:scientificNameID or, in ColDP, the coldp:alternativeID field. The main IDs are expected to be local.
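To illustrate the distinction, here is a small Python sketch of how an identifier might be split into a scope and a value, falling back to local when no known scope prefix is present. The `KNOWN_SCOPES` set and the `split_scope` function are assumptions for illustration and do not reflect the real ChecklistBank parsing rules:

```python
# Illustrative list of scope prefixes; not ChecklistBank's actual list.
KNOWN_SCOPES = {"urn", "http", "https", "doi"}


def split_scope(identifier, known_scopes=KNOWN_SCOPES):
    """Return (scope, value); IDs without a known scope are treated as local."""
    ident = identifier.strip()
    prefix, sep, rest = ident.partition(":")
    if sep and prefix.lower() in known_scopes:
        return prefix.lower(), rest
    return "local", ident


print(split_scope("d8167df7-42af-4dd2-bc49-5aae7f11a500"))
# ('local', 'd8167df7-42af-4dd2-bc49-5aae7f11a500')
print(split_scope("urn:lsid:zoobank.org:act:D8167DF7-42AF-4DD2-BC49-5AAE7F11A500"))
# ('urn', 'lsid:zoobank.org:act:D8167DF7-42AF-4DD2-BC49-5AAE7F11A500')
```

Under this assumption, a bare UUID in dwc:scientificNameID or coldp:alternativeID ends up as local, whereas a full LSID keeps its own scope.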
In the xrelease, I see ZooBank identifiers like this:
local:d8167df7-42af-4dd2-bc49-5aae7f11a500
https://api.checklistbank.org/dataset/301904/nameusage/DV3JP
It seems like it should be prefixed with their lsid namespace instead of local:?
urn:lsid:zoobank.org:act:D8167DF7-42AF-4DD2-BC49-5AAE7F11A500
https://zoobank.org/NomenclaturalActs/d8167df7-42af-4dd2-bc49-5aae7f11a500
As far as I can tell, the local: namespace gets added by the backend:
https://www.checklistbank.org/dataset/2037/verbatim?q=D8167DF7-42AF-4DD2-BC49-5AAE7F11A500
I might also start using COLDP alternativeID for Systema Dipterorum in order to add the ZooBank IDs soon, although Systema Dipterorum is in the process of adding the ZooBank IDs.