UKUK-Repository-Dept / ukuk-dspace

DSpace repository of the Charles University, Prague, Czech Republic.
https://dspace.cuni.cz

Metadata enrichment during ingest #39

Open jrihak opened 5 years ago

jrihak commented 5 years ago

Basic idea

DSpace should be able to enrich metadata during the ingest / item creation process.

Example

Add a document type from a different typology, mapped from the DSpace document type according to a mapfile.

1) Each object has a type described by the DSpace document typology.
2) During ingestion, we would like to add a mapped type that reflects a different document typology (e.g. the COAR typology), based on a mapfile referenced in a standalone configuration file.
3) During ingest, the DSpace package ingester would check the DSpace document type assigned in the dc.type element and map it to the corresponding value from the other typology.
3a) If the assigned DSpace document type was not found in the mapfile, ingest of that item would fail with an error (this should prevent typos and, in general, the assignment of unsupported DSpace types). A failed item ingestion shouldn't cause the whole batch ingestion to fail.
3b) If the assigned DSpace document type was found in the mapfile, the mapped type would be added to a predefined metadata field of the object and ingest would continue normally.
4) At the end of the ingestion process, DSpace should notify the admin of any errors that occurred during ingestion due to failed mapping.
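
The steps above could be sketched roughly as follows. This is only an illustration in plain Java, not DSpace code: the class `TypeMappingSketch`, the `source = target` mapfile syntax, the target field `dc.type.coar`, and the representation of an item as a simple field-to-value map are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TypeMappingSketch {

    /** Parses "DSpace type = mapped type" lines; this mapfile syntax is an assumption. */
    static Map<String, String> loadMapfile(Reader source) throws IOException {
        Map<String, String> mapping = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] parts = line.split("=", 2);
            if (parts.length != 2) {
                throw new IOException("Malformed mapfile line: " + line);
            }
            mapping.put(parts[0].trim(), parts[1].trim());
        }
        return mapping;
    }

    /**
     * Steps 3, 3a, 3b: adds the mapped type to each item, or records an error
     * for that item and moves on, so one bad item does not fail the whole batch.
     * Returns the collected errors for the end-of-run report (step 4).
     */
    static List<String> enrichBatch(Map<String, Map<String, String>> batch,
                                    Map<String, String> mapfile,
                                    String targetField) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> item : batch.entrySet()) {
            String type = item.getValue().get("dc.type");
            String mapped = (type == null) ? null : mapfile.get(type.trim());
            if (mapped == null) {
                errors.add(item.getKey() + ": no mapping for dc.type '" + type + "'");
                continue; // this item fails, the batch continues
            }
            item.getValue().put(targetField, mapped);
        }
        return errors;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative mapfile content; the values are placeholders, not real COAR URIs.
        Map<String, String> mapfile = loadMapfile(
                new StringReader("# DSpace type = mapped type\narticle = journal article\n"));

        Map<String, Map<String, String>> batch = new LinkedHashMap<>();
        Map<String, String> ok = new LinkedHashMap<>();
        ok.put("dc.type", "article");
        Map<String, String> typo = new LinkedHashMap<>();
        typo.put("dc.type", "artcle");
        batch.put("item_001", ok);
        batch.put("item_002", typo);

        // "dc.type.coar" is a hypothetical name for the predefined target field.
        List<String> errors = enrichBatch(batch, mapfile, "dc.type.coar");
        for (String error : errors) {
            System.err.println("Mapping failed for " + error);
        }
    }
}
```

Collecting errors in a list rather than throwing on the first unmapped type is what keeps a single failed item from aborting the batch, while still leaving something to report to the admin at the end.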

Implementation notes

DSpace accepts packages in several formats. It would be nice if the metadata enrichment process were able to look for dc.type metadata within different input formats, or to choose the correct way to search for metadata according to the package format.
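
One possible shape for this is a per-format extractor chosen by package format. The sketch below is an assumption-laden illustration, not existing DSpace API: the `TypeExtractor` interface, the `EXTRACTORS` registry and the `"saf"` format key are hypothetical. Only the Simple Archive Format case is filled in, reading `dcvalue` elements with `element="type"` from a `dublin_core.xml` stream; other formats would need their own implementations.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TypeExtractorSketch {

    /** Hypothetical strategy interface: one implementation per package format. */
    interface TypeExtractor {
        List<String> extractTypes(InputStream metadata) throws Exception;
    }

    /** Reads unqualified dc.type values from a Simple Archive Format dublin_core.xml. */
    static class SafExtractor implements TypeExtractor {
        @Override
        public List<String> extractTypes(InputStream metadata) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(metadata);
            NodeList values = doc.getElementsByTagName("dcvalue");
            List<String> types = new ArrayList<>();
            for (int i = 0; i < values.getLength(); i++) {
                Element value = (Element) values.item(i);
                String qualifier = value.getAttribute("qualifier");
                boolean unqualified = qualifier.isEmpty() || "none".equals(qualifier);
                if ("type".equals(value.getAttribute("element")) && unqualified) {
                    types.add(value.getTextContent().trim());
                }
            }
            return types;
        }
    }

    /** Hypothetical registry keyed by a package format name. */
    static final Map<String, TypeExtractor> EXTRACTORS = new HashMap<>();
    static {
        EXTRACTORS.put("saf", new SafExtractor());
    }

    /** Picks the extractor for the given format, or fails loudly if none is registered. */
    static List<String> extract(String format, InputStream metadata) throws Exception {
        TypeExtractor extractor = EXTRACTORS.get(format);
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor for package format: " + format);
        }
        return extractor.extractTypes(metadata);
    }
}
```

With this layout, supporting another package format would mean registering one more extractor, while the mapping step itself stays unchanged.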

jitka commented 5 years ago

https://wiki.duraspace.org/display/DSDOC5x/Importing+and+Exporting+Items+via+Simple+Archive+Format#ImportingandExportingItemsviaSimpleArchiveFormat-AddingItemstoaCollectionfromadirectory

org.dspace.app.itemimport.ItemImport