pulibrary / cicognara-catalogo

TEI encoding of Cicognara's catalogo ragionato

MARC records with no corresponding item in Catalogo #35

Open cwulfman opened 8 years ago

cwulfman commented 8 years ago

There are 14 MARC records (attached) in cicognara-catalogo/cicognara.mrx.xml whose dclnums do not appear in the TEI-encoded Catalogo.

There are 4704 MARC records in the cicognara.mrx.xml file in the Catalogo GitHub repository (https://github.com/pulibrary/cicognara-catalogo). All 4704 records appear to have dclnums associated with them:

count(//record[.//subfield[@code='2' and . = 'dclib']])  ==> 4704

There are 14 MARC records whose dclib number does not appear in catalogo.tei.xml.

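(: Namespace URIs are assumed to be the standard TEI and MARCXML ones. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare namespace marc = "http://www.loc.gov/MARC21/slim";

(: Every dclib number referenced by a Catalogo item; @corresp may hold several, space-separated. :)
let $corresp-values := doc('/db/cicognara-data/catalogo.tei.xml')//tei:item/@corresp
let $dcl-in-tei := for $value in $corresp-values return tokenize($value, ' ')
let $recs := collection('/db/cicognara-data')//marc:record
(: MARC records none of whose dclib numbers is referenced in the TEI. :)
let $not-in-tei :=
    for $rec in $recs
    let $dclib-field := $rec/marc:datafield[@tag='024' and ./marc:subfield[@code='2'] = 'dclib']
    let $dclibnum := $dclib-field/marc:subfield[@code = 'a']
    where not($dclibnum = $dcl-in-tei)
    return $rec
return $not-in-tei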

These, I believe, are the records to focus on first: if these 14 can be resolved, then all the MARC records will be linked to something in the Catalogo, and the shot-list can be considered complete.

We can hope that each of these 14 can be matched to one (or more) of the 672 items in the Catalogo that lack dclib numbers. Those 672 items are probably harder to resolve, but if I understand the situation correctly, resolving them is not necessary for submitting the shot list: the task at hand is digitizing the fiche, and the shot list is derived from cicognara.mrx.xml.
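For anyone who wants to repeat the check outside the XML database, the same comparison can be sketched in Python with the standard library. This is only a sketch under assumptions: the function names are made up for illustration, the namespace URIs are assumed to be the standard TEI and MARCXML ones, and the dclib numbers in MARC field 024 $a are assumed to use exactly the same form as the tokens in `@corresp` (as the query above also assumes).

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: the standard TEI and MARCXML (MARC21 slim) URIs.
TEI_NS = "http://www.tei-c.org/ns/1.0"
MARC_NS = "http://www.loc.gov/MARC21/slim"
NS = {"tei": TEI_NS, "marc": MARC_NS}


def dclnums_in_tei(tei_root):
    """Collect every whitespace-separated token of every tei:item/@corresp."""
    nums = set()
    for item in tei_root.iter(f"{{{TEI_NS}}}item"):
        nums.update(item.get("corresp", "").split())
    return nums


def record_dclnums(record):
    """Return the 024 $a values of a MARC record whose 024 $2 is 'dclib'."""
    nums = []
    for field in record.findall("marc:datafield[@tag='024']", NS):
        source = field.find("marc:subfield[@code='2']", NS)
        if source is not None and (source.text or "").strip() == "dclib":
            for sub in field.findall("marc:subfield[@code='a']", NS):
                nums.append((sub.text or "").strip())
    return nums


def records_not_in_tei(marc_root, tei_root):
    """MARC records none of whose dclib numbers is referenced in the TEI.

    Like the XQuery, this also returns records with no dclib number at all.
    """
    known = dclnums_in_tei(tei_root)
    return [
        rec
        for rec in marc_root.iter(f"{{{MARC_NS}}}record")
        if not any(num in known for num in record_dclnums(rec))
    ]


def items_without_corresp(tei_root):
    """Catalogo items that carry no dclib number in @corresp."""
    return [
        item
        for item in tei_root.iter(f"{{{TEI_NS}}}item")
        if not item.get("corresp", "").split()
    ]
```

With local copies of the two files, `ET.parse("cicognara.mrx.xml").getroot()` and `ET.parse("catalogo.tei.xml").getroot()` would supply the two roots; whether the counts come out as 14 and 672 depends on the assumptions above holding for the real data.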

marc-not-in-tei.xml.gz

ma20 commented 8 years ago

I've taken one go at it so far and resolved two of the fourteen: bib ID 8494058 is the record for Cicognara number 128, and bib ID 8546119 is the record for Cicognara entry number 1200. I think 8546117 corresponds to Cicognara entry number 1319, but am not yet positive. 1319 lacks a DCLib number and the titles in the Catalogo and MARC records are similar, although not identical. This is a common discrepancy, especially with items of this type--a speech delivered in 1776.

The other items, at a glance, appear to be ephemeral: art prints, newspaper clippings, and the like. Those will take more digging.

cwulfman commented 8 years ago

@ma20 and I agree that the remaining 12 from the list above correspond with the Miscellaneous leaflets and drawings mentioned in the documentation accompanying the fiche (with a volume number and an "interno number" identifying the work).

@ma20 will investigate which Cicognara items these miscellanea belong to and edit the TEI-encoded Catalogo accordingly. Meanwhile, they should not interfere with the generation of the shot list: these 12 items might be added as an appendix to the list, each with its dclnum and its bib ID.

@jpstroop, could you please comment on the feasibility of such an appendix to the shot list? If this strategy sounds feasible, this ticket should be retained but not included as part of the countdown to fiche digitization.

jpstroop commented 8 years ago

I think it's up to Sandy whether we want that stuff digitized or not. She's already commented to me that there are a few vols from the Fondo that are in the Fiche set that shouldn't be part of the project, so I'm guessing these would fall into that category. (I think I have that all straight...)

@ma20 --I'm out of town and think this might be easier to explain face to face than in an email. Would you mind running this by Sandy? If she wants them, I'm sure we can add them to the shot list.

ma20 commented 8 years ago

@jpstroop I spoke to Sandy. We do want these on the shotlist, so let's go ahead and add them.

That being said, their role in the final application is still up in the air. It will likely be as an appendix to the Catalogo as Cliff said, but Sandy wants to check on each physical item in the Vatican Library when she is there later this month. We will know more about where they fit into the Catalogo after we learn more about their nature. @jpstroop is absolutely correct about the distinction between the Catalogo (which was made in 1821) and the Fondo (made in 1830) but these appear to relate to Catalogo items and not Fondo items. Does that make sense?

jpstroop commented 8 years ago

@ma20 Sounds good. As long as they're in the updated list of bib ids I assume I'll be receiving from you, and those bib records all have dclib numbers, I can update the shotlist code to include them even though they're not referenced in the TEI.
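The arrangement agreed on here could look roughly like the following Python sketch. It is not the project's actual shot-list code, which does not appear in this thread: `BibRecord` and `build_shot_list` are hypothetical names, and the sketch assumes each bib record has already been reduced to a bib ID and a dclib number.

```python
from typing import NamedTuple


class BibRecord(NamedTuple):
    """Hypothetical minimal view of a bib record for shot-list purposes."""
    bibid: str
    dclnum: str  # empty string if the record has no dclib number


def build_shot_list(records, tei_dclnums):
    """Split records into the main list and an appendix.

    Records whose dclib number is referenced in the TEI go in the main
    list; the rest go in the appendix. Every record must have a dclib
    number, which is the condition stated in the comment above.
    """
    missing = [rec.bibid for rec in records if not rec.dclnum]
    if missing:
        raise ValueError(f"bib records without a dclib number: {missing}")
    main = [rec for rec in records if rec.dclnum in tei_dclnums]
    appendix = [rec for rec in records if rec.dclnum not in tei_dclnums]
    return main, appendix
```

Keeping the appendix as a separate list means the unresolved items can later move into the main list simply by adding their dclib numbers to the TEI, without changing the bib ID list.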