ntra00 / marc2bibframe

Convert marc to BIBFRAME 1.0 - see lcnetdev/marc2bibframe2 for current release
http://www.loc.gov/bibframe/

MARC 505 0 0 #58

Open timathom opened 10 years ago

timathom commented 10 years ago

Currently, titles from an enhanced 505 contents note (subfield $t) are being mapped to bf:contains/bf:Work. Subfield $g is being mapped to bf:note.

I have a record for a book set that has six volumes:

<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
    <marc:record type="Bibliographic">
        <marc:leader>01713cam a2200373 i 4500</marc:leader>
        <marc:controlfield tag="001">7703201</marc:controlfield>
        <marc:controlfield tag="005">20140508170225.0</marc:controlfield>
        <marc:controlfield tag="008">130819s2008    bl a     b    000 0 por d</marc:controlfield>
        <marc:datafield tag="020" ind1=" " ind2=" ">
            <marc:subfield code="a">9788599535974 (vol. 1)</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="020" ind1=" " ind2=" ">
            <marc:subfield code="a">9788599535981 (vol. 2)</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="020" ind1=" " ind2=" ">
            <marc:subfield code="a">9788599535998 (vol. 3)</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="020" ind1=" " ind2=" ">
            <marc:subfield code="a">9788579020018 (vol. 4)</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="020" ind1=" " ind2=" ">
            <marc:subfield code="a">9788579020025 (vol. 5)</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="020" ind1=" " ind2=" ">
            <marc:subfield code="a">9788579020032 (vol. 6)</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="040" ind1=" " ind2=" ">
            <marc:subfield code="a">NjP</marc:subfield>
            <marc:subfield code="b">eng</marc:subfield>
            <marc:subfield code="e">rda</marc:subfield>
            <marc:subfield code="c">NjP</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="050" ind1=" " ind2="4">
            <marc:subfield code="a">HE2928</marc:subfield>
            <marc:subfield code="b">.C35 2008</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="043" ind1=" " ind2=" ">
            <marc:subfield code="a">s-bl---</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="245" ind1="0" ind2="0">
            <marc:subfield code="a">Caminhos do trem :</marc:subfield>
            <marc:subfield code="b">apogeu, decadência e retomada da ferrovia no Brasil.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="246" ind1="1" ind2=" ">
            <marc:subfield code="i">Title on container:</marc:subfield>
            <marc:subfield code="a">Coleção caminhos do trem :</marc:subfield>
            <marc:subfield code="b">apogeu, decadência e retomada da ferrovia no Brasil</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="264" ind1=" " ind2="1">
            <marc:subfield code="a">São Paulo - SP :</marc:subfield>
            <marc:subfield code="b">Duetto Editorial,</marc:subfield>
            <marc:subfield code="c">[2008]</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="300" ind1=" " ind2=" ">
            <marc:subfield code="a">6 volumes :</marc:subfield>
            <marc:subfield code="b">illustrations (chiefly color) ;</marc:subfield>
            <marc:subfield code="c">28 cm</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="336" ind1=" " ind2=" ">
            <marc:subfield code="a">text</marc:subfield>
            <marc:subfield code="2">rdacontent</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="336" ind1=" " ind2=" ">
            <marc:subfield code="a">still image</marc:subfield>
            <marc:subfield code="2">rdacontent</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="337" ind1=" " ind2=" ">
            <marc:subfield code="a">unmediated</marc:subfield>
            <marc:subfield code="2">rdamedia</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="338" ind1=" " ind2=" ">
            <marc:subfield code="a">volume</marc:subfield>
            <marc:subfield code="2">rdacarrier</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="500" ind1=" " ind2=" ">
            <marc:subfield code="a">"A trajetória das estradas de ferro brasileiras e seu impacto no desenvolvimento econômico e integração nacional"--Container.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="504" ind1=" " ind2=" ">
            <marc:subfield code="a">Includes bibliographical references.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="505" ind1="0" ind2="0">
            <marc:subfield code="g">1.</marc:subfield>
            <marc:subfield code="t">Origens : nos trilhos do café --</marc:subfield>
            <marc:subfield code="g">2.</marc:subfield>
            <marc:subfield code="t">Grandes ferrovias : malha ferroviária --</marc:subfield>
            <marc:subfield code="g">3.</marc:subfield>
            <marc:subfield code="t">Locomotivas e vagões : do vapor ao elétrico --</marc:subfield>
            <marc:subfield code="g">4.</marc:subfield>
            <marc:subfield code="t">A conquista do território : paisagens recriadas --</marc:subfield>
            <marc:subfield code="g">5.</marc:subfield>
            <marc:subfield code="t">Profissão ferroviário --</marc:subfield>
            <marc:subfield code="g">6.</marc:subfield>
            <marc:subfield code="t">De volta aos trilhos : expresso para o futuro.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="650" ind1=" " ind2="0">
            <marc:subfield code="a">Railroads</marc:subfield>
            <marc:subfield code="z">Brazil</marc:subfield>
            <marc:subfield code="x">History.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="650" ind1=" " ind2="0">
            <marc:subfield code="a">Railroads</marc:subfield>
            <marc:subfield code="z">Brazil</marc:subfield>
            <marc:subfield code="x">History</marc:subfield>
            <marc:subfield code="v">Pictorial works.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="700" ind1="1" ind2=" ">
            <marc:subfield code="a">Vasquez, Pedro,</marc:subfield>
            <marc:subfield code="e">editor.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="730" ind1="0" ind2=" ">
            <marc:subfield code="a">História viva.</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="904" ind1=" " ind2=" ">
            <marc:subfield code="a">sla</marc:subfield>
            <marc:subfield code="b">o</marc:subfield>
            <marc:subfield code="h">n</marc:subfield>
            <marc:subfield code="c">b</marc:subfield>
            <marc:subfield code="e">20130819</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="902" ind1=" " ind2=" ">
            <marc:subfield code="a">tat</marc:subfield>
            <marc:subfield code="b">o</marc:subfield>
            <marc:subfield code="6">a</marc:subfield>
            <marc:subfield code="7">m</marc:subfield>
            <marc:subfield code="d">v</marc:subfield>
            <marc:subfield code="f">6</marc:subfield>
            <marc:subfield code="e">20140508</marc:subfield>
        </marc:datafield>
    </marc:record>
    <marc:record type="Holdings">
        <marc:leader>00230cv  a22001094  4500</marc:leader>
        <marc:controlfield tag="001">7504272</marc:controlfield>
        <marc:controlfield tag="004">7703201</marc:controlfield>
        <marc:controlfield tag="005">20140508105435.0</marc:controlfield>
        <marc:controlfield tag="008">1308190p    8   1001uu   0000000</marc:controlfield>
        <marc:datafield tag="852" ind1="0" ind2=" ">
            <marc:subfield code="b">f</marc:subfield>
            <marc:subfield code="h">HE2928</marc:subfield>
            <marc:subfield code="i">.C35 2008</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="866" ind1=" " ind2="0">
            <marc:subfield code="a">1-6</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="866" ind1=" " ind2="0">
            <marc:subfield code="x">DESIGNATOR: vol.</marc:subfield>
        </marc:datafield>
    </marc:record>
</marc:collection>

When I run the conversion, I get a top-level bf:Work for the set, and then six bf:contains/bf:Work elements for the individual volumes. In this case, it seems to make sense to model the individual volumes as bf:Work, but will that always be the case? Are catalogers consciously making this kind of semantic distinction when they create enhanced 505 notes? Non-formatted 505 notes are not being modeled as bf:Work, but are mapped to bf:contentsNote.

Also, in the current conversion, the first subfield $g is getting omitted, which throws off the numbering (i.e., the first bf:contains/bf:Work below should have <bf:note>1.</bf:note>, not <bf:note>2.</bf:note>):

<bf:contains>
    <bf:Work>
        <bf:authorizedAccessPoint>Origens : nos trilhos do café
            --</bf:authorizedAccessPoint>
        <bf:title>Origens : nos trilhos do café --</bf:title>
        <bf:note>2.</bf:note>
    </bf:Work>
</bf:contains>
<bf:contains>
    <bf:Work>
        <bf:authorizedAccessPoint>Grandes ferrovias : malha ferroviária
            --</bf:authorizedAccessPoint>
        <bf:title>Grandes ferrovias : malha ferroviária --</bf:title>
        <bf:note>3.</bf:note>
    </bf:Work>
</bf:contains>
<bf:contains>
    <bf:Work>
        <bf:authorizedAccessPoint>Locomotivas e vagões : do vapor ao elétrico
            --</bf:authorizedAccessPoint>
        <bf:title>Locomotivas e vagões : do vapor ao elétrico --</bf:title>
        <bf:note>4.</bf:note>
    </bf:Work>
</bf:contains>
<bf:contains>
    <bf:Work>
        <bf:authorizedAccessPoint>A conquista do território : paisagens recriadas
            --</bf:authorizedAccessPoint>
        <bf:title>A conquista do território : paisagens recriadas --</bf:title>
        <bf:note>5.</bf:note>
    </bf:Work>
</bf:contains>
<bf:contains>
    <bf:Work>
        <bf:authorizedAccessPoint>Profissão ferroviário --</bf:authorizedAccessPoint>
        <bf:title>Profissão ferroviário --</bf:title>
        <bf:note>6.</bf:note>
    </bf:Work>
</bf:contains>
<bf:contains>
    <bf:Work>
        <bf:authorizedAccessPoint>De volta aos trilhos : expresso para o
            futuro.</bf:authorizedAccessPoint>
        <bf:title>De volta aos trilhos : expresso para o futuro.</bf:title>
    </bf:Work>
</bf:contains>

Finally, this record contains an 020 field with an ISBN for each volume. Each 020 causes a new bf:hasInstance/bf:Instance element to get generated. However, there is no correlation between these instances and the bf:Work elements generated from the titles in 505 $t. It seems incorrect to assert that each individual volume is an instance of the top-level bf:Work. Shouldn't the instances in this case be mapped to the bf:contains/bf:Work elements? There may not be a reliable way to do this, but with this record, at least, the volume numbers following the ISBNs could be matched to the designators in subfield $g.
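A minimal sketch of the matching idea described above, written in Python for illustration only (it is not the converter's code). It assumes the 020 $a qualifier looks like "(vol. N)" and the 505 $g designator looks like "N."; the function names are hypothetical, and records that do not follow this pattern simply get no link.

```python
import re

def volume_from_isbn_field(field_020a):
    """Parse an 020 $a such as '9788579020018 (vol. 4)' into (isbn, volume)."""
    m = re.match(r"\s*([0-9Xx-]+)\s*\((?:vol\.|v\.)\s*(\d+)\)", field_020a, re.IGNORECASE)
    return (m.group(1), int(m.group(2))) if m else None

def volume_from_designator(subfield_g):
    """Parse a 505 $g designator such as '4.' into the integer 4."""
    m = re.fullmatch(r"\s*(\d+)\.?\s*", subfield_g)
    return int(m.group(1)) if m else None

def link_isbns_to_contents(isbn_fields, contents):
    """Pair each (g, t) contents entry with the ISBN whose qualifier names the same volume.

    Returns a list of (title, isbn_or_None) tuples in contents order.
    """
    isbn_by_volume = {}
    for field in isbn_fields:
        parsed = volume_from_isbn_field(field)
        if parsed:
            isbn, volume = parsed
            isbn_by_volume[volume] = isbn
    return [(t, isbn_by_volume.get(volume_from_designator(g))) for g, t in contents]

isbns = ["9788599535974 (vol. 1)", "9788579020018 (vol. 4)"]
contents = [
    ("1.", "Origens : nos trilhos do café --"),
    ("4.", "A conquista do território : paisagens recriadas --"),
    ("5.", "Profissão ferroviário --"),
]
links = link_isbns_to_contents(isbns, contents)
```

Entries with no matching ISBN come back paired with `None`, so a caller can decide whether to fall back to attaching the instance to the top-level work.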

ntra00 commented 10 years ago

"Are catalogers consciously making this kind of semantic distinction when they create enhanced 505 notes?"

Yes, I believe so: the entries are track titles, chapter headings, volumes, etc., each of which is a work in its own right. When contents are not broken out nicely, we don't have enough info to build a work, so it stays a literal text field, hence the contentsNote.

"Also, in the current conversion, the first subfield $g is getting omitted, which throws off the numbering."

I'll fix this.

Your last point about the 020s is interesting. I did the work to break out the instances, but I don't think I finished the linkage between 505s and 020s. [Note to self: look for "experimental 505a matching to isbn".] In your case, there's no exact match between the text of the 020 and the 505 to make a link, although a human can figure out that "9788579020018 (vol. 4)" matches "4." from 505 $g. I think I made it work in some cases, but it resulted in too much crud in others. I'll revisit that code and maybe add a toggle for "attempt to link 020 and 505", so that systems can opt in if they know they have clean data.

kiegel commented 9 years ago

Field 505 00 has two purposes. One is to provide better access (than in a 505 0_) to author and title information within a contents note. This is accomplished in BF by creating a separate Work for each entry.

A second purpose is to provide a human-readable display of contents information for inclusion in a bib description. This is not available now, and it would be hard to reassemble work entries back into a contents note. To remedy the problem, we suggest also copying the 505 00 into bf:contentsNote (without subfield delimiters). That way, the display outcome would be the same whether a cataloger chose to encode a contents note in basic or enhanced format.
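As a rough illustration of that suggestion, the sketch below (Python, not the converter's code) flattens the subfields of a 505 00 into one display string. It assumes the field has already been parsed into (code, value) pairs; `contents_note` is a hypothetical helper name.

```python
def contents_note(subfields):
    """Join 505 subfield values in order, dropping the subfield codes."""
    return " ".join(value.strip() for _code, value in subfields if value.strip())

field_505 = [
    ("g", "1."),
    ("t", "Origens : nos trilhos do café --"),
    ("g", "2."),
    ("t", "Grandes ferrovias : malha ferroviária."),
]
note = contents_note(field_505)
```

Because the ISBD punctuation is already carried in the subfield values, joining them with single spaces should give the same string a cataloger would have typed into a basic 505 0_ $a.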

kiegel commented 9 years ago

The parsing of $g can be off: it does not always appear before $t. Although it is messy, the correct approach is to group all of the subfields between double dashes into a single work. Here is an example where it goes wrong.

505 00 |t The triumph of time |g (29:38) -- |t Ritual fragment |g (11:23) -- |t Gawain's journey |g (24:37)

<http://example.org/99161770439001452work25> a bf:Work ;
    bf:title "The triumph of time" .

<http://example.org/99161770439001452work26> a bf:Work ;
    bf:note "(29:38)" ;
    bf:title "Ritual fragment" .

<http://example.org/99161770439001452work27> a bf:Work ;
    bf:note "(11:23)" ;
    bf:title "Gawain's journey" .

Durations are associated with the wrong works.

(OCLC # 878532352)
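A minimal sketch of the "everything between double dashes is one work" rule, in Python for illustration only. It assumes the 505 has been parsed into (code, value) pairs with the " --" separator left at the end of a subfield value, as in the example above; `group_entries` and `to_work` are hypothetical names.

```python
def group_entries(subfields):
    """Split 505 subfields into entries, closing an entry at each trailing '--'."""
    entries, current = [], []
    for code, value in subfields:
        text = value.strip()
        closes_entry = text.endswith("--")
        if closes_entry:
            text = text[:-2].rstrip()  # drop the separator itself
        current.append((code, text))
        if closes_entry:
            entries.append(current)
            current = []
    if current:  # the last entry has no trailing '--'
        entries.append(current)
    return entries

def to_work(entry):
    """Collect $t as the title and $g as notes, regardless of their order."""
    return {
        "title": " ".join(v for c, v in entry if c == "t"),
        "notes": [v for c, v in entry if c == "g"],
    }

field_505 = [
    ("t", "The triumph of time"), ("g", "(29:38) --"),
    ("t", "Ritual fragment"), ("g", "(11:23) --"),
    ("t", "Gawain's journey"), ("g", "(24:37)"),
]
works = [to_work(e) for e in group_entries(field_505)]
```

The same grouping handles fields where $g precedes $t, since order inside an entry no longer matters. It does not split entries separated only by " ; ", so several $t subfields inside one double-dash group would end up in a single title.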

ntra00 commented 9 years ago

Well, I originally had it with $t first. Then it was found that $g came first (https://github.com/lcnetdev/marc2bibframe/issues/89), and now you have some with $t first again.

kiegel commented 9 years ago

Double dashes at the end of bf:title in works can be stripped.

505 00 |t Blew -- |t Floyd the barber -- |t About a girl -- |t School -- |t Love buzz -- |t Paper cuts -- |t Negative creep -- |t Scoff -- |t Swap meet -- |t Mr. Moustache -- |t Sifting -- |t Big cheese -- |t Downer -- |g Live, Feb. 9, 1990, Pine Street Theatre, Portland, Ore. |t Intro ; |t School ; |t Floyd the barber ; |t Dive ; |t Love buzz ; |t Spank thru ; |t Molly's lips ; |t Sappy ; |t Scoff ; |t About a girl ; |t Been a son ; |t Blew.

<http://example.org/99137969100001452work16> a bf:Work ;
    bf:title "Blew --" .

<http://example.org/99137969100001452work17> a bf:Work ;
    bf:title "Floyd the barber --" .

<http://example.org/99137969100001452work18> a bf:Work ;
    bf:title "About a girl --" .

(OCLC # 429048800)
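One way the stripping could look, sketched in Python rather than in the converter's own code. `clean_title` is a hypothetical helper; it removes only a trailing double dash and leaves other punctuation (such as a final " ;" or ".") untouched.

```python
import re

def clean_title(title):
    """Remove a trailing ' --' separator from a 505 $t value."""
    return re.sub(r"\s*--\s*$", "", title).strip()

titles = [clean_title(t) for t in ["Blew --", "Floyd the barber --", "Big cheese"]]
```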