alawvt closed 2 years ago
@pyc1, it seems like we could improve SWORD metadata mapping with just this one crosswalk. If there are metadata fields that we want to add for all submissions, we might be able to add them as well. You could list the changes you have been making to SWORD submissions.
Fields we could add to all SWORD submissions:
- dc.format.mimetype[en] application/pdf
- dc.rights[en] Creative Commons Attribution 4.0 International
- dc.rights.uri[en] http://creativecommons.org/licenses/by/4.0/
- dc.language.iso[en] en
- dc.type[en] Article - Refereed
- dc.type.dcmitype[en] Text
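To make the idea concrete, here is a minimal sketch of filling in these defaults only when a submission does not already carry the field. It is not DSpace code: the `DEFAULT_FIELDS` dictionary and the `apply_defaults` helper are hypothetical names, and the flat field-to-value dictionary is an assumed stand-in for however the item metadata is actually held.

```python
# Hypothetical defaults taken from the list above; not an existing DSpace API.
DEFAULT_FIELDS = {
    "dc.format.mimetype": "application/pdf",
    "dc.rights": "Creative Commons Attribution 4.0 International",
    "dc.rights.uri": "http://creativecommons.org/licenses/by/4.0/",
    "dc.language.iso": "en",
    "dc.type": "Article - Refereed",
    "dc.type.dcmitype": "Text",
}


def apply_defaults(metadata, defaults=DEFAULT_FIELDS):
    """Return a copy of metadata with defaults added for missing fields only."""
    merged = dict(metadata)
    for field, value in defaults.items():
        # setdefault leaves any publisher-supplied value untouched.
        merged.setdefault(field, value)
    return merged


item = apply_defaults({"dc.title": "Example article", "dc.rights": "All rights reserved"})
print(item["dc.type"])    # Article - Refereed
print(item["dc.rights"])  # All rights reserved
```

Filling only missing fields means a publisher that sends its own rights statement would keep it rather than being overwritten with the CC BY default.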
BioMed Central/SpringerOpen notes:
- middle initials are missing the period
- the "en" language code comes in as dc.language.rfc3066
- DOI comes in as dc.identifier.uri
MDPI notes:
- middle initials have 2 spaces before them
- DOI comes in as dc.identifier
Hindawi notes:
- DOI comes in as dc.identifier; the URL also needs "https" added and "dx." removed
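The two middle-initial problems noted above (missing period, doubled space) could be cleaned up with something like the sketch below. It assumes author names arrive as "Surname, Given I" strings, which is a guess about the incoming format, and `clean_author_name` is a hypothetical helper, not part of DSpace.

```python
import re


def clean_author_name(name):
    """Collapse repeated spaces and add a period after bare single-letter initials."""
    # "Smith, John  A." -> "Smith, John A."
    collapsed = re.sub(r"\s+", " ", name).strip()
    # A lone capital letter surrounded by whitespace (or at the end) is treated
    # as an initial that lost its period: "Smith, John A" -> "Smith, John A."
    return re.sub(r"(?<!\S)([A-Z])(?=\s|$)", r"\1.", collapsed)


print(clean_author_name("Smith, John A"))    # Smith, John A.
print(clean_author_name("Smith, John  A."))  # Smith, John A.
```

The initial is only matched when it stands alone between spaces, so names such as "O'Brien" are left as they are.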
map DOI correctly from both dc.identifier.url and dc.identifier.doi.
I think you mean dc.identifier.uri and dc.identifier. I'm not sure that we get any that go correctly into dc.identifier.doi.
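Whichever field the DOI arrives in, the value itself could be normalized the same way. The sketch below assumes the incoming value is either a bare DOI, a `doi:` prefixed string, or a `doi.org` / `dx.doi.org` URL; `normalize_doi` is a hypothetical helper and the regular expression is a common approximation of DOI syntax, not an official one.

```python
import re

# Optional resolver or "doi:" prefix, followed by something shaped like a DOI.
_DOI_RE = re.compile(
    r"(?:https?://(?:dx\.)?doi\.org/|doi:\s*)?(10\.\d{4,9}/\S+)",
    re.IGNORECASE,
)


def normalize_doi(value):
    """Return the DOI as an https://doi.org/ URL, or None if value is not a DOI."""
    match = _DOI_RE.fullmatch(value.strip())
    if match is None:
        return None
    return "https://doi.org/" + match.group(1)


print(normalize_doi("http://dx.doi.org/10.1155/2021/123456"))
# https://doi.org/10.1155/2021/123456
print(normalize_doi("10.3390/robotics10040109"))
# https://doi.org/10.3390/robotics10040109
print(normalize_doi("http://hdl.handle.net/10919/12345"))
# None
```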
One publisher is also sending us:
<epdcx:statement epdcx:propertyURI="http://purl.org/dc/terms/idendifier/doi">
<epdcx:valueString>10.3390/robotics10040109</epdcx:valueString>
</epdcx:statement>
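To check what a given publisher is actually sending, a quick script along these lines can list every statement's property URI and value. It uses only the Python standard library; the `statement_values` helper is a hypothetical name, and the sample wraps the statement above with the epdcx namespace declaration (http://purl.org/eprint/epdcx/2006-11-16/) so that it parses on its own.

```python
import xml.etree.ElementTree as ET

EPDCX_NS = "http://purl.org/eprint/epdcx/2006-11-16/"

# The statement from above, with the namespace declared so it is well-formed.
SAMPLE = f"""<epdcx:statement xmlns:epdcx="{EPDCX_NS}"
    epdcx:propertyURI="http://purl.org/dc/terms/idendifier/doi">
  <epdcx:valueString>10.3390/robotics10040109</epdcx:valueString>
</epdcx:statement>"""


def statement_values(xml_text):
    """Return (propertyURI, valueString) pairs for every epdcx:statement."""
    root = ET.fromstring(xml_text)
    pairs = []
    # iter() also yields the root element itself when it matches the tag.
    for stmt in root.iter(f"{{{EPDCX_NS}}}statement"):
        prop = stmt.get(f"{{{EPDCX_NS}}}propertyURI")
        value = stmt.findtext(f"{{{EPDCX_NS}}}valueString")
        if prop and value:
            pairs.append((prop, value.strip()))
    return pairs


for prop, value in statement_values(SAMPLE):
    print(prop, value)
```

Running this over a saved mets.xml from each publisher would show which property URIs the crosswalk needs templates for.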
Tom Gibons from ACM notes that only some of their materials have Creative Commons licenses, so we might not be able to add that for all.
It looks like BioMed Central is still using sword-mets "SWAP Metadata" packaging, including eprints terms (xmlns:epdcx="http://purl.org/eprint/epdcx/2006-11-16/"). MDPI and Hindawi also use this packaging. A typical header is:
<mets ID="sort-mets_mets" OBJID="sword-mets" LABEL="DSpace SWORD Item" PROFILE="DSpace METS SIP Profile 1.0" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:epdcx="http://purl.org/eprint/epdcx/2006-11-16/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.loc.gov/METS/">
The DSpace crosswalk for this is defined in dspace/dspace/config/crosswalks/sword-swap-ingest.xsl. This crosswalk is not mapping
These could also be mapped:
It could also be corrected to map to dc.identifier.doi instead of dc.identifier.uri, although this might map non-DOIs. Investigate other SWORD collections to see what mappings they use and if their mappings can be improved.
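One way to limit the risk of mapping non-DOIs is to pick the target field from the shape of the value. This is only a sketch of the decision logic in Python, not XSLT for the crosswalk itself; `target_field` is a hypothetical helper and the pattern is an approximation of DOI syntax.

```python
import re

# A bare DOI or a doi.org / dx.doi.org URL; anything else is treated as a plain URI.
DOI_PATTERN = re.compile(
    r"(?:https?://(?:dx\.)?doi\.org/)?10\.\d{4,9}/\S+",
    re.IGNORECASE,
)


def target_field(identifier):
    """Choose dc.identifier.doi for DOI-shaped values, dc.identifier.uri otherwise."""
    if DOI_PATTERN.fullmatch(identifier.strip()):
        return "dc.identifier.doi"
    return "dc.identifier.uri"


print(target_field("https://doi.org/10.3390/robotics10040109"))  # dc.identifier.doi
print(target_field("http://hdl.handle.net/10919/12345"))         # dc.identifier.uri
```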
SWAP Profile, formerly Eprints profile