crkn-rcdr / Digital-Preservation

Documentation and related schemas for the CRKN digital preservation system

Only use Dublin Core for Archival descriptive metadata #6

Closed RussellMcOrmond closed 1 year ago

RussellMcOrmond commented 5 years ago

Canadiana currently uses 3 different formats for descriptive metadata (see: CMETS): MARCXML, Issueinfo, and Dublin Core.

Archivematica only uses Dublin Core. The proposal is for Canadiana to only use Dublin Core for archival description.

This will involve a lossy conversion from MARCXML and Issueinfo. The MARC records will be used directly by our access platform, and we will support an Issueinfo import.

ernieejo commented 5 years ago

What happens to the UTF-8 character encoding in a lossy conversion? It's important to keep that default coding as it impacts the display of diacritics.

RussellMcOrmond commented 5 years ago

@ernieejo, I don't think there will be any loss as far as character encoding is concerned. "Lossy" here means that a MARCXML or Issueinfo record holds more data than Dublin Core supports, so not everything will be stored in the preservation platform. Everything will still be displayed on the access platform, which will be using the rich records.
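To illustrate the distinction, here is a minimal Python sketch. It assumes a single MARCXML `record` element and uses a deliberately tiny, hypothetical mapping table (`ILLUSTRATIVE_MAP` is not Canadiana's actual crosswalk). Unmapped fields are dropped, which is the lossy part, while diacritics in the mapped fields pass through unchanged.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical mapping for illustration only:
# (MARC tag, subfield code) -> Dublin Core element name.
ILLUSTRATIVE_MAP = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("650", "a"): "subject",
}


def marcxml_to_dc(marcxml):
    """Return (dc, dropped): mapped DC values and the MARC keys left behind."""
    record = ET.fromstring(marcxml)  # assumes the root element is one <record>
    dc = {}
    dropped = []
    for field in record.findall("marc:datafield", MARC_NS):
        tag = field.get("tag")
        for sub in field.findall("marc:subfield", MARC_NS):
            key = (tag, sub.get("code"))
            element = ILLUSTRATIVE_MAP.get(key)
            if element is None:
                dropped.append(key)  # no DC home for this data: the "loss"
            else:
                dc.setdefault(element, []).append((sub.text or "").strip())
    return dc, dropped


# Made-up sample record, not taken from any real collection.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Mémoires d'un voyageur</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
    <subfield code="a">24 p.</subfield>
  </datafield>
</record>"""

dc, dropped = marcxml_to_dc(SAMPLE)
print(dc)       # the title keeps its accented characters
print(dropped)  # the physical description (300$a) is not carried over
```

In this sketch the accented title survives intact, and only the field with no Dublin Core equivalent in the mapping is lost.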

@nataliemacdonald may have additional comments.

RussellMcOrmond commented 3 years ago

@nataliemacdonald, I briefly looked at https://www.archivematica.org/en/docs/archivematica-1.13/user-manual/transfer/import-metadata/

Elements beyond Dublin Core will end up as MDTYPE="OTHER", and using those in any way would require custom software on our end. We'll want to think about if and how we might want to use that.

Same with https://www.archivematica.org/en/docs/archivematica-1.13/user-manual/transfer/import-metadata/#importing-other-types-of-metadata which allows other metadata formats to be included as files, but wouldn't be used as regular Archivematica metadata (in searches/etc), and wouldn't be part of the DIP transfer to the access platform unless we did something custom.

https://www.archivematica.org/en/docs/archivematica-1.13/user-manual/metadata/METS/ describes using a METS file as part of the transfer, which is also interesting...
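As a rough idea of what the "custom software on our end" could start from, here is a Python sketch that lists the `dmdSec` entries in a METS document whose `mdWrap` is marked MDTYPE="OTHER". The sample document, the IDs, and the `OTHERMDTYPE` value are made up for illustration, and this has not been checked against real Archivematica output.

```python
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}


def non_dc_sections(mets_xml):
    """Return (dmdSec ID, OTHERMDTYPE) pairs for sections typed as OTHER."""
    root = ET.fromstring(mets_xml)
    found = []
    for dmd in root.findall("mets:dmdSec", METS_NS):
        wrap = dmd.find("mets:mdWrap", METS_NS)
        if wrap is not None and wrap.get("MDTYPE") == "OTHER":
            found.append((dmd.get("ID"), wrap.get("OTHERMDTYPE")))
    return found


# Made-up METS fragment: one Dublin Core section, one "OTHER" section.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="dmdSec_1">
    <mets:mdWrap MDTYPE="DC"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
  <mets:dmdSec ID="dmdSec_2">
    <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="CUSTOM"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""

print(non_dc_sections(SAMPLE_METS))
```

Anything this returns would be data that the regular Archivematica metadata handling leaves for us to deal with.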

Excited to be moving forward on this.

RussellMcOrmond commented 1 year ago

This is no longer a proposal, but the current plan. Before attempting to migrate SIP data from the custom CIHM packaging format to Archivematica, we will convert all existing dmdSec records to Dublin Core.

Tools are being authored to automate as much of this as we can.

Note to anyone reading:

Now that we have completed our Preservation-Access Split project, the metadata stored in Preservation doesn't need to be (and regularly shouldn't be) the same as what is stored in Access. While everything in Preservation will be Dublin Core, Titles in Access will predominantly be described using MARC. Metadata for access will advance independently of what metadata is stored for preservation.

FYI to @jmacgreg so he knows there are sometimes members and patrons who read these discussions.

RussellMcOrmond commented 1 year ago

Additional note: We have added support for repeating columns for Dublin Core CSV files, to be closer to what Archivematica supports. https://github.com/crkn-rcdr/cihm-metadatabus/issues/64
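For anyone unfamiliar with the idea, a "repeating column" means the same header can appear more than once in a row, with each occurrence adding another value for that element. The Python sketch below shows one way to read such a file. The column names and sample row are hypothetical, and this is not the cihm-metadatabus implementation.

```python
import csv
import io
from collections import defaultdict


def read_repeating_columns(csv_text):
    """Parse CSV text, collecting values from repeated headers into lists."""
    # csv.reader is used instead of csv.DictReader because DictReader keeps
    # only the last value when a header name is repeated.
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    records = []
    for row in reader:
        record = defaultdict(list)
        for name, value in zip(header, row):
            if value.strip():  # skip empty cells
                record[name].append(value.strip())
        records.append(dict(record))
    return records


# Hypothetical sample: dc.subject appears twice in the header.
SAMPLE_CSV = (
    "objid,dc.title,dc.subject,dc.subject\n"
    "example.0001,Example title,History,Canada\n"
)

print(read_repeating_columns(SAMPLE_CSV))
```

Here both `dc.subject` cells end up in one list for the record rather than the second overwriting the first.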

I'm told that some time in the new year the Heritage Services team will be meeting to end any use of IssueInfo and use Dublin Core in those instances, now that the custom fields in IssueInfo relating to relationships are no longer used (that relationship information is stored in IIIF collection data, not within custom descriptive data).

RussellMcOrmond commented 1 year ago

Crosswalks are part of other tickets.

Ingesting of new AIPs only supports Dublin Core, so this specific issue can be marked as completed.