Open regineheberlein opened 4 years ago
@regineheberlein -- for the sake of this useful discussion, would you be willing to give us one or two side-by-side examples of what it would look like for current archival description to be marked up as structured data?
What I mean is something like the following, which I grabbed randomly from one of our EAD finding aids.
I'll sidestep the question of the final data model and structure standard for the moment, whether that's going to be RiC/RiC-O, or BIBFRAME-ARM, or something else.
Princeton's John James Audubon Collection, beginning only (in the interest of brevity) of the collection-level scopecontent:
This collection consists of original manuscripts, photostats and transcripts of additional manuscripts, and printed material relating to Audubon, his life and work. [It goes on and on after that, but I think this will do for argument's sake].
Series 2 scopecontent:
Consists primarily of correspondence to Robert Havell and Mrs. Audubon, 1831-1834.
Item in Series 2: ALS to Robert Havell, Edinburgh; scopecontent:
Contains instructions for re-engraving legends.
The way the EAD is currently structured, it is basically saying this:
John James Audubon Collection : has part : "Oversize Letters"
John James Audubon Collection :: has part :: "ALS to Robert Havell"
John James Audubon Collection : has description : “consists of manuscripts and printed material”
John James Audubon Collection :: has [part] description :: “consists of correspondence to R. Havell and Mrs. Audubon”
John James Audubon Collection ::: has [part] description ::: “contains instructions for re-engraving legends”
Instead, we might want to position our data so that we can get closer to something like this (excuse the totally fictional ontology):
<URL> : instance of : GLAM resource <url>
<URL> : has name : “John James Audubon Collection”
<URL> : has genre : correspondence <url>
<URL> : has genre : report <url>
<URL> : has physical form : document <url>
<URL> : has physical form : volume <url>
<URL> : has physical form : manuscript <url>
<URL> : has part : <partURL>
<partURL> : has name : “Oversize Letters”
<partURL> : has physical form : document <url>
<partURL> : has genre : correspondence <url>
<partURL> : has part : <partPartURL>
<partPartURL> : has name : “ALS to Robert Havell”
<partPartURL> : has author : <authorURL>
<partPartURL> : has recipient : <recipientURL>
<partPartURL> : has subject : legends <url>
<partPartURL> : has subject : corrections <url>
<partPartURL> : has subject : engraving <url>
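To make the contrast a little more concrete, here is a minimal Python sketch of a few of the statements above held as plain (subject, predicate, object) tuples. It is only an illustration: the identifiers are the same placeholders used in the fictional ontology (not real URIs), object values are written as bare labels for readability, and the helper names (objects, descendants, subjects_with) are made up for this example.

```python
# Placeholder identifiers from the fictional ontology above; none are real URIs.
TRIPLES = [
    ("<URL>", "instance of", "GLAM resource"),
    ("<URL>", "has name", "John James Audubon Collection"),
    ("<URL>", "has genre", "correspondence"),
    ("<URL>", "has part", "<partURL>"),
    ("<partURL>", "has name", "Oversize Letters"),
    ("<partURL>", "has genre", "correspondence"),
    ("<partURL>", "has part", "<partPartURL>"),
    ("<partPartURL>", "has name", "ALS to Robert Havell"),
    ("<partPartURL>", "has subject", "engraving"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def descendants(subject, triples=TRIPLES):
    """Follow 'has part' statements recursively, depth first."""
    found = []
    for part in objects(subject, "has part", triples):
        found.append(part)
        found.extend(descendants(part, triples))
    return found


def subjects_with(predicate, obj, triples=TRIPLES):
    """Return every resource carrying the statement, whatever its level."""
    return [s for s, p, o in triples if p == predicate and o == obj]


if __name__ == "__main__":
    print(descendants("<URL>"))
    print(subjects_with("has genre", "correspondence"))
```

The point of the sketch is that each part carries its own statements, so a question like "which resources are correspondence?" can be answered directly from the data, without first knowing where a component sits in the hierarchy.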
Linked data, in whichever implementation, is the likely and imminent future of resource description. Principle 4 positions DACS for this transition:
How does this affect our thinking about discursive notes (currently implemented in the EAD data structure standard as scopecontent, accessrestrict, bioghist, arrangement, etc.) going forward? While a content standard is by definition implementation-agnostic, I believe it needs to be aware of and address its own likely implementations.
With the transition from hierarchical to linked description, data inheritance is replaced by data inference. Data inference is in turn based primarily on structured data, not discursive text. Any discursive text that is part of the description is itself identified by a URI, and in current implementations such as Wikidata and Wikipedia “earns” its entification by being notable, unique, and verifiable.
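As a rough illustration of what inference over structured data could look like, the Python sketch below derives new statements from existing ones. The rule it applies, the predicate name "has subject in part", and the function name infer_subjects are all hypothetical and invented for this example; they are not drawn from RiC-O, BIBFRAME, or any other ontology. The identifiers are again placeholders.

```python
# Placeholder identifiers; the predicates mirror the fictional ontology above.
EXAMPLE = [
    ("<URL>", "has part", "<partURL>"),
    ("<partURL>", "has part", "<partPartURL>"),
    ("<partPartURL>", "has subject", "engraving"),
    ("<partPartURL>", "has subject", "legends"),
]


def infer_subjects(triples):
    """Hypothetical rule: a resource 'has subject in part' S whenever one of
    its parts, at any depth, is directly stated to have subject S."""
    parts = {}   # parent -> list of direct parts
    direct = {}  # resource -> set of directly stated subjects
    for s, p, o in triples:
        if p == "has part":
            parts.setdefault(s, []).append(o)
        elif p == "has subject":
            direct.setdefault(s, set()).add(o)

    def collect(node):
        # Gather subjects stated on every descendant of node.
        found = set()
        for child in parts.get(node, []):
            found |= direct.get(child, set())
            found |= collect(child)
        return found

    return {
        (node, "has subject in part", subject)
        for node in parts
        for subject in collect(node)
    }


inferred = infer_subjects(EXAMPLE)

if __name__ == "__main__":
    for statement in sorted(inferred):
        print(statement)
```

Nothing is copied down the hierarchy here. The collection-level statements are computed from structured statements made at the item level, which is something a discursive note cannot support without being parsed first.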
OCLC has proposed an elegant solution during their Project Passage pilot, a project testing resource description in a wikibase implementation of linked data:
Based on the developments outlined above, I propose including a recommendation with the DACS revision to address resource description as a network of linked entities, along these lines: