PerseusDL / catalog_data

MODS and MADS data for the Perseus Catalog

MODS records and CITE URNs #122

Open cwulfman opened 6 years ago

cwulfman commented 6 years ago

I'm puzzling again about the MODS records and their identifiers, and I find myself asking why they don't have CITE URNs.

If I understand this correctly (from reading the Homer Multitext Documentation), CTS URNs refer to texts:

They [CTS URNs] provide the permanent canonical references to texts or passages of text, and are used by Canonical Text Services (CTS) to identify or retrieve passages of text.

and CITE URNs refer to bibliographic carriers:

They [CITE URNs] provide the permanent canonical references to discrete objects (whether physical or notional), and are used by the CITE Collection service to identify and retrieve digital representations of those objects.

So it makes sense that the Argonautica of Apollonius Rhodius would have a CTS URN (urn:cts:greekLit:tlg0001.tlg001). It also makes sense that the text of the Laurentian MS as revised by R. Merkel would have a CTS URN. It doesn't have one at the moment, because it hasn't been digitized by the project, but that URN would be something like urn:cts:greekLit:tlg0001.tlg001.rm. The Laurentian MS text would have its own CTS URN as well (e.g., urn:cts:greekLit:tlg0001.tlg001.lms). And Edward P. Coleridge's English prose translation would have a CTS URN (perhaps urn:cts:greekLit:tlg0001.tlg001.epc).
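To make the layering in these examples concrete, here is a minimal Python sketch that splits a CTS URN into its namespace, text group, work, and version parts. The names `CtsUrn` and `parse_cts_urn` are hypothetical (they are not part of any Perseus or CITE tooling), and the sketch ignores the optional passage component.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """Hypothetical container for the parts of a CTS URN (illustration only)."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN such as urn:cts:greekLit:tlg0001.tlg001.opp-eng1.

    Assumes the shape urn:cts:<namespace>:<textgroup>[.<work>[.<version>]];
    any passage component after a further colon is ignored here.
    """
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work_component = parts[2], parts[3]
    levels = work_component.split(".")
    textgroup = levels[0]
    work = levels[1] if len(levels) > 1 else None
    version = levels[2] if len(levels) > 2 else None
    return CtsUrn(namespace, textgroup, work, version)


# The notional work has no version part; an edition or translation does.
argonautica = parse_cts_urn("urn:cts:greekLit:tlg0001.tlg001")
coleridge = parse_cts_urn("urn:cts:greekLit:tlg0001.tlg001.epc")
```

Under these assumptions, `argonautica.version` is `None` while `coleridge.version` is `"epc"`, which is the distinction between the work and one particular text of it.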

But the published volume identified with the CTS URN urn:cts:greekLit:tlg0001.tlg001.opp-eng1 is not a text. It is a discrete bibliographic object that happens to carry urn:cts:greekLit:tlg0001.tlg001.epc. From what I can understand from the documentation, the CITE URN format requires a collection namespace and a work identifier; the CITE URN for this edition might be

urn:cite:perseus:tlg0001_tlg001_coleridge_1889

but could also be a NOID or a UUID:

urn:cite:perseus:13030/f54x54g11
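As a rough sketch of the idea, the Python below builds identifiers in the two shapes shown above and records which text a carrier carries. It assumes only what this comment describes (a collection namespace plus an identifier); the actual CITE URN specification may require further components, and `make_cite_urn` and `Carrier` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


def make_cite_urn(namespace: str, identifier: str) -> str:
    """Join a collection namespace and an identifier in the shape used above.

    Assumption: urn:cite:<namespace>:<identifier>, as in the examples in this
    comment. This is not checked against the CITE specification.
    """
    if not namespace or not identifier:
        raise ValueError("namespace and identifier must be non-empty")
    if ":" in namespace or ":" in identifier:
        raise ValueError("colons are reserved as URN separators")
    return f"urn:cite:{namespace}:{identifier}"


@dataclass(frozen=True)
class Carrier:
    """Hypothetical record linking a bibliographic object to the texts it carries."""
    cite_urn: str
    carries: Tuple[str, ...]  # CTS URNs of the carried texts


# A readable identifier and an opaque NOID-style identifier for the same volume.
readable = make_cite_urn("perseus", "tlg0001_tlg001_coleridge_1889")
opaque = make_cite_urn("perseus", "13030/f54x54g11")

coleridge_volume = Carrier(
    cite_urn=readable,
    carries=("urn:cts:greekLit:tlg0001.tlg001.epc",),
)
```

The point of the `Carrier` record is only that the object identifier and the text identifier are kept separate, so one volume could list several carried texts.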

I'm not sure how MODS would represent the relationship between the bibliographic object it is describing (the carrier) and the text or texts being carried; perhaps something using , or escaping altogether and using .

I don't think we need to solve this right now, but I'd like to get your thoughts on it.

AlisonBabeu commented 6 years ago

Oh my goodness, the return of CITE URNs and FRBR. Your analysis is quite right, and in all honesty this is one reason I have always called the catalog FRBR-inspired, not FRBR-compliant: what we have represented through our various CTS URNs is the manifestation-level carrier of various expressions of works, because in the end the Perseus Catalog (I swear) really did start as a finding aid.

I believe there are several ways that MODS can represent the relationship between a bibliographic object and a notional work, but I'll have to look into them more in the next few days.

cwulfman commented 6 years ago

I think you raise a really important point, @AlisonBabeu: the Perseus Catalog started out as a finding aid, but its purpose has evolved. MODS is a schema for representing bibliographic items, as we've been saying -- really good for that, but as such it is not well suited either for finding aids in general or for the specific sort of relationships Perseus seems to want to express, which are generally at a more abstract level.

I've started a wiki page over in the Perseus_catalog repo: https://github.com/PerseusDL/perseus_catalog/wiki/Functional-Requirements. Now that I've waded deep into the metadata marsh, I find myself bogged down, and it isn't clear what to do next.

AlisonBabeu commented 6 years ago

I worry sometimes that it's not so much a metadata marsh as metadata quicksand... or maybe a rabbit hole.

cwulfman commented 6 years ago

@AlisonBabeu @gregorycrane any thoughts on the question of minting identifiers for items in the new Perseus Catalog? Any objections to my setting up an account with https://ezid.cdlib.org?

AlisonBabeu commented 6 years ago

Hi @cwulfman, I'm assuming these would be persistent identifiers for the MODS records as manifestations (but not a replacement for CTS URNs, yes?)

cwulfman commented 6 years ago

That's right: if I'm reading the docs correctly (e.g., http://www.homermultitext.org/hmt-doc/cite/cite-urn-overview.html), bibliographic items should be cite-able (duh?) and therefore should have CITE URNs. Many of these items have OCLC identifiers, but not all of them, alas.