bioversity / dataverse-dashboard-curation

Dashboard to help with metadata curation on Dataverse
MIT License

Improving the metadata on Bioversity Dataverse #8

Open gubi opened 6 years ago

gubi commented 6 years ago

As requested via mail:

Looking at the Metadata tool output, we feel that there are two priority areas: adding AGROVOC term IDs to the keywords, and adding the geographical coverage to the metadata records.

Keywords

Would it be possible to extract all unique keywords from the Bioversity Dataverse, check these terms against AGROVOC and, if an exact match is found in AGROVOC, add the corresponding AGROVOC term-id to the relevant Dataverse entries?
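A minimal sketch of what this matching step could look like, assuming the keywords have already been exported from Dataverse and that AGROVOC labels are available as a local lookup table. The data shapes, the function names, and the placeholder term IDs in the usage note are all hypothetical; "exact match" is interpreted here as case-insensitive with whitespace collapsed, which is also an assumption. Nothing is written back: the function only produces proposals for a human to approve.

```python
def normalize(term):
    """Lower-case and collapse whitespace so that matching is exact but case-insensitive."""
    return " ".join(term.lower().split())


def match_keywords(dataset_keywords, agrovoc_labels):
    """Propose AGROVOC term IDs for keywords that match a label exactly.

    dataset_keywords: {dataset_id: [keyword, ...]}   (hypothetical export shape)
    agrovoc_labels:   {label: term_id}               (hypothetical local lookup)
    Returns (proposals, unmatched_keywords).
    """
    index = {normalize(label): term_id for label, term_id in agrovoc_labels.items()}

    # Collect every unique keyword across all datasets.
    unique = {normalize(k) for kws in dataset_keywords.values() for k in kws}
    matched = {k: index[k] for k in unique if k in index}

    # One proposal per (dataset, keyword) pair, left unapproved for human review.
    proposals = []
    for dataset_id, kws in dataset_keywords.items():
        for kw in kws:
            term_id = matched.get(normalize(kw))
            if term_id is not None:
                proposals.append(
                    {"dataset": dataset_id, "keyword": kw,
                     "term_id": term_id, "approved": False}
                )

    unmatched = sorted(unique - set(matched))
    return proposals, unmatched
```

For example, with made-up IDs, `match_keywords({"ds1": ["Banana"]}, {"banana": "c_0001"})` yields one unapproved proposal for `ds1` and an empty list of unmatched keywords. The unmatched list is returned so a curator can see which keywords would need manual mapping.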

Coverage (geographical)

Would it be possible to use the existing metadata, e.g. the dataset title and description fields, to (semi-)automatically extract e.g. the country names and insert these in the geographical coverage field?
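As a rough illustration of the semi-automatic part, the sketch below scans the title and description for names taken from a country list supplied by the caller. The record shape (`id`, `title`, `description`) and the function names are hypothetical, and the country list is assumed to come from some external source. It is plain whole-word string matching, not real text parsing, so it only produces suggestions that still need a human check.

```python
import re


def find_countries(text, country_names):
    """Return the names from country_names that occur in text as whole words."""
    found = []
    for name in country_names:
        # Word boundaries stop "Niger" from matching inside "Nigeria".
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found


def suggest_coverage(record, country_names):
    """Build an unapproved coverage suggestion from a record's title and description."""
    text = f"{record.get('title', '')} {record.get('description', '')}"
    return {
        "dataset": record.get("id"),
        "countries": find_countries(text, country_names),
        "approved": False,  # a human still has to confirm before anything is published
    }
```

This approach has obvious limits: a name contained in a longer name (e.g. "Guinea" inside "Papua New Guinea") will be reported as well, and alternative spellings or regions are missed entirely, which is one more reason to keep the human approval step.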

NB: any changes made by these scripts would still be checked and approved by a human before the modified records are posted/published!

gubi commented 5 years ago

Regarding the geospatial coverage, in the BioversityDataset there's a dedicated field datasetVersion › metadataBlocks › geospatial › fields › .... I guess this is the one you're talking about. I don't think it's possible to fill it starting from a simple "string" value, unless we use AI or another external dataset... I mean, it's very complicated and requires a lot of code for text parsing.

Could you please share the output so that I can analyze it?

gubi commented 5 years ago

Created the https://github.com/gubi/bioversity_agrovoc-indexing repository. Waiting for the ownership transfer.