uwlib-cams / MARC2RDA

mapping between MARC21 and RDA-RDF
Creative Commons Zero v1.0 Universal

651 subject added entry--geographic name #246

Open CECSpecialistI opened 3 years ago

CECSpecialistI commented 3 years ago

https://github.com/uwlib-cams/MARC2RDA/blob/main/Working%20Documents/6XX.csv

szapoun commented 1 month ago

@cspayne Added lines for breaking $v$x$y$z when $2=fast. Please check.

cspayne commented 1 month ago

The mapping shows that we are adding an authorized access point to IRIs provided in $0's and $1's for places with the authorized access point being the combined values of $a and $g. I have two questions related to this:

  1. Why are we not minting a nomen for the authorized access point here, while we are for the authorized access point when we mint our own IRI?
  2. Why is the authorized access point $a and $g here but only $a when we mint our own IRI?

I suggest that we decide whether to use $a and $g or only $a and be consistent, as well as decide how these values should be combined (separated by a space or by --). We should also be consistent in minting a nomen for an authorized access point when we have a source and using a string value as an access point (not authorized) with no source.

If we are in agreement about this, I can update the mapping and the code.

GordonDunsire commented 1 month ago

@cspayne:

  1. Don't know. We should be consistent and mint a nomen if there is a source so we can preserve the source information.
  2. Don't know, and there are no examples of subfield g that I can look at. From the definition alone, it looks unsafe to include the subfield in the heading for similar reasons to other fields with "miscellaneous" or "not accommodated in another subfield" semantics. To be consistent we should just map subfield $a.

pennylenger commented 1 month ago

@cspayne Hi Cypress, regarding $0's and $1's, you wrote: "The mapping shows that we are adding an authorized access point to IRIs provided in $0's and $1's for places with the authorized access point being the combined values of $a and $g."

  1. If we have no qualifier subfields, then AP/AAP is relevant, and it is useful to add the RDA AAP statement with the subfield $1 IRI as subject and the value as a string object, with no nomen, because we already 'know' the source from the IRI. (Gordon suggested this in the previous #Subject Heading discussion.)
  2. We just map $a.

rdawo:P10321 <$1> . // has subject place
<$1> rdapd:P70045 ($a) . // has authorized access point for place
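
The two statements above could be generated along the following lines. This is only a sketch, not the project's transform: the function name and the example.org IRIs are hypothetical, and using the work IRI as the subject of the first statement is an assumption, since the note leaves the subject implicit.

```python
def place_subject_statements(work_iri: str, place_iri: str, label: str) -> list:
    """Build Turtle-style statements for a 651 that has a $1 IRI and an $a value.

    Assumption: the implicit subject of the first statement is the work.
    The prefixes rdawo: and rdapd: are used as written in the comment above.
    """
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return [
        # has subject place
        f"<{work_iri}> rdawo:P10321 <{place_iri}> .",
        # has authorized access point for place (plain string object, no nomen)
        f'<{place_iri}> rdapd:P70045 "{escaped}" .',
    ]


# Hypothetical IRIs, for illustration only.
for line in place_subject_statements(
    "http://example.org/work/1", "http://example.org/place/1", "Mexiko"
):
    print(line)
```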

cspayne commented 4 weeks ago

Here's one example from gnd for Mexico City:

<marc:datafield tag="651" ind1=" " ind2="7">
            <marc:subfield code="a">Mexiko</marc:subfield>
            <marc:subfield code="g">Stadt</marc:subfield>
            <marc:subfield code="2">gnd</marc:subfield>
</marc:datafield>

GordonDunsire commented 3 weeks ago

@cspayne: The example of subfield $g is bad data, and unfortunately it may be a hint of a larger problem. The value 'Stadt' is German for 'city', so the intended place is 'Mexico City' or 'Mexico (city)'. In both cases these should be recorded as subfield $a: the MARC 21 manual explicitly states for subfield $a 'Parenthetical qualifying information is not separately subfield coded'. I am wondering if this is an import artefact rather than a simple cataloguer error, arising from the inclusion of DNB authority data in M21. But we don't need to know how it has happened because the result for the transform is the same: only use $a for the AAP, as the manual instructs.

In the case of this example, this will output Mexico the country instead of Mexico City, but it is not invalid because Mexico City the place is part of Mexico the place :-)

If we can determine that if source is 'gnd' then subfield $g is a placename qualifier, then the values can be concatenated, but I think we have run out of time to make this a safe decision. There is an alternative: concatenate subfields $a and $g, but output the value as an AP, not an AAP.
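
The two options in this comment (use only $a and call it an AAP, or concatenate $a and $g and call it an AP) can be sketched as a small function. This is illustrative only: the function name, the `concatenate` flag, and the default space separator are assumptions, since the separator was still undecided earlier in the thread.

```python
from typing import Optional, Tuple


def place_access_point(
    name: str,
    qualifier: Optional[str] = None,
    concatenate: bool = False,
    separator: str = " ",
) -> Tuple[str, str]:
    """Return (kind, value) for a 651 place heading.

    kind is "AAP" when only $a is used, and "AP" when $a and $g are
    concatenated, because the combined string cannot safely be called
    authorized.
    """
    if concatenate and qualifier:
        return ("AP", f"{name}{separator}{qualifier}")
    return ("AAP", name)


print(place_access_point("Mexiko", "Stadt"))
print(place_access_point("Mexiko", "Stadt", concatenate=True))
```

With the Mexico City example, the default returns only `Mexiko` as an AAP, which is the country-level result described above; the alternative returns the combined string but downgrades it to an AP.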

cspayne commented 3 weeks ago

I haven't found any with subfield $g that aren't from gnd, but I found a few more examples from gnd:

        <marc:datafield tag="651" ind1=" " ind2="7">
            <marc:subfield code="a">Südafrika</marc:subfield>
            <marc:subfield code="g">Kontinent</marc:subfield>
            <marc:subfield code="2">gnd</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="651" ind1=" " ind2="7">
            <marc:subfield code="a">Griechenland</marc:subfield>
            <marc:subfield code="g">Altertum</marc:subfield>
            <marc:subfield code="2">gnd</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="651" ind1=" " ind2="7">
            <marc:subfield code="a">East End</marc:subfield>
            <marc:subfield code="g">London</marc:subfield>
            <marc:subfield code="2">gnd</marc:subfield>
        </marc:datafield>

@AdamSchiff do you know if $g is commonly used as part of the access point for gnd? If Adam doesn't know then I agree with Gordon, we should just use $a at this point.

AdamSchiff commented 3 weeks ago

If you look at a GND authority, the $g is being used as the qualifier to a heading, and I think it's an integral part of the authorized access point. You should not ignore it, as without the qualifier, the meaning is often ambiguous.

For example, for East End of London, here's the GND authority: https://d-nb.info/gnd/4138902-5

You can see that the authorized access point is East End (London), and the variant access point is London- East End. If you look at it in MARC21 you can see the encoding does use $g for the qualifier (London):

<?xml version="1.0" encoding="UTF-8"?>

00000nz a2200000nc 4500 041389026 DE-101 20200527132957.0 880701n||azznnaabn | ana |c 4138902-5 http://d-nb.info/gnd/4138902-5 gnd (DE-101)041389026 (DE-588)4138902-5 (DE-588c)4138902-5 v:zg DE-101 DE-101 r:DE-101 ger 1150 gnd1 XA-GB g gndgen gik gndspec g s w 2 4215 d:2 t:2020-05-27 23/ger East End London London- East End (DE-101)040743357 (DE-588)4074335-4 https://d-nb.info/gnd/4074335-4 London adue https://d-nb.info/standards/elementset/gnd#hierarchicalSuperiorOfPlaceOrGeographicName r Ueberordnung M unter London swd g London- East End (DE-588c)4138902-5

I looked for other GND authorities by searching in Wikidata, and it appears that $g is used regularly for qualifiers for city sections. For example for the Schöneberg neighborhood in Berlin (https://d-nb.info/gnd/4106612-1):

<?xml version="1.0" encoding="UTF-8"?>

00000nz a2200000nc 4500 04106612X DE-101 20221011120125.0 880701n||azznnaabn | ana |c 4106612-1 http://d-nb.info/gnd/4106612-1 gnd 7290254 geonames E 013 20 51 E 013 20 51 N 052 29 01 N 052 29 01 geonames https://sws.geonames.org/7290254 A:agx E013.347699 E013.347699 N052.483879 N052.483879 geonames https://sws.geonames.org/7290254 A:dgx (DE-101)04106612X (DE-588)4106612-1 (DE-588)17567-5 (DE-588b)17567-5 v:zg (DE-588c)4106612-1 v:zg DE-101 DE-101 r:DE-101 ger 0025 gnd1 XA-DXDE XA-DE-BE g gndgen gik gndspec g s f w z v o 2 43155 t:2009-09-17 23/ger Schöneberg Berlin Schöneberg Berlin Magistrat spio https://d-nb.info/standards/elementset/gnd#variantName r Spitzenorgan Spitzenorgan Magistrat Schöneberg, Berlin spio https://d-nb.info/standards/elementset/gnd#variantName r Spitzenorgan Spitzenorgan Berlin Kreis Teltow Stadt Schöneberg v:1898-1911 Berlin-Schöneberg 1912-1920 1874-1920 datb https://d-nb.info/standards/elementset/gnd#dateOfEstablishmentAndTermination r Zeitraum (DE-101)001169440 (DE-588)116944-0 https://d-nb.info/gnd/116944-0 Alt-Schöneberg vorg https://d-nb.info/standards/elementset/gnd#precedingPlaceOrGeographicName r Vorgaenger (DE-101)040873404 (DE-588)4087340-7 https://d-nb.info/gnd/4087340-7 Berlin-Schöneberg nach https://d-nb.info/standards/elementset/gnd#succeedingPlaceOrGeographicName r Nachfolger (DE-101)040057283 (DE-588)4005728-8 https://d-nb.info/gnd/4005728-8 Berlin obpa https://d-nb.info/standards/elementset/gnd#broaderTermPartitive r Oberbegriff partitiv X:1 MMi M Stand: 20.01.2022 https://de.wikipedia.org/wiki/Berlin-Sch%C3%B6neberg Stadt südlich von Berlin am Nordrand des Teltow, wahrscheinlich 1. Drittel 13. Jh. als Straßendorf gegründet, 1874 Zusammenschluss von Alt- und Neu-Schöneberg (um 1750 gegründet), 1.4.1898 Stadtrecht im Landkreis Teltow, 1.4.1899 kreisfrei, 1.4.1912 zum Zweckverband Groß-Berlin, seitdem Berlin-Schöneberg, 1.10.1920 als 11. 
Bezirk nach Berlin eingemeindet swd g Schöneberg <Berlin> (DE-588c)4106612-1 gkd a Schöneberg <Berlin> (DE-588b)17567-5

The qualifier in $g is also used for celestial bodies (https://d-nb.info/gnd/4179170-8):

<?xml version="1.0" encoding="UTF-8"?>

00000nz a2200000nc 4500 041791703 DE-101 20240308153540.0 880701n||azznnbabn | ana |c 4179170-8 http://d-nb.info/gnd/4179170-8 gnd (DE-101)041791703 (DE-588)4179170-8 (DE-588c)4179170-8 v:zg DE-101 DE-101 r:DE-101 ger 9999 gnd1 XN 20 sswd g gndgen gix gndspec g s w 2 9926 d:3 t:2007-01-01 23/ger 523.46 d:3 t:2007-01-01 23/ger 133.537 d:3 t:2015-07-03 23/ger Saturn Planet (DE-101)040462129 (DE-588)4046212-2 https://d-nb.info/gnd/4046212-2 Planet obin https://d-nb.info/standards/elementset/gnd#broaderTermInstantial r Oberbegriff instantiell X:1 SWL (DE-101)1134041160 (DLC)sh85117690 http://id.loc.gov/authorities/subjects/sh85117690 Saturn (Planet) EQ https://d-nb.info/standards/elementset/gnd#equivalence Aequivalenz lcsh L:eng (DE-101)1134041160 (FrPBN)FRBNF11976363 https://data.bnf.fr/ark:/12148/cb11976363g Saturne (planète) EQ https://d-nb.info/standards/elementset/gnd#equivalence Aequivalenz ram L:fre swd s Saturn <Planet> (DE-588c)4179170-8

Here's the GND record for Mississippi River, which in GND is established as Mississippi (Fluss) (https://d-nb.info/gnd/4039589-3):

<?xml version="1.0" encoding="UTF-8"?>

00000nz a2200000nc 4500 040395898 DE-101 20230818063143.0 880701n||azznnbabn | ana |c 4039589-3 http://d-nb.info/gnd/4039589-3 gnd (DE-101)040395898 (DE-588)4039589-3 (DE-588c)4039589-3 v:zg DE-101 DE-101 r:DE-101 ger 9999 gnd1 XD-US 19.3 sswd g gndgen gin gndspec g s w o 2 77 t:2007-01-01 23/ger Mississippi Fluss (DE-101)041319729 (DE-588)4131972-2 https://d-nb.info/gnd/4131972-2 Fluss obin https://d-nb.info/standards/elementset/gnd#broaderTermInstantial r Oberbegriff instantiell X:1 M (DE-101)992345448 (ZBW)091412544 Mississippi (Fluss) =EQ https://d-nb.info/standards/elementset/gnd#exactEquivalence exakte Aequivalenz stw swd g Mississippi <Fluss> (DE-588c)4039589-3

In all cases that I've seen, what's in $g is rendered in a label as a qualifier in parentheses:

East End (London)

Schöneberg (Berlin)

Saturn (Planet)

Mississippi (Fluss)

If you drop these qualifiers, the term is no longer clear. Does "Saturn" refer to a planet, a deity, or an automobile model? Does "Mississippi" refer to a state or a river?

Adam

Adam L. Schiff, Principal Cataloger, University of Washington Libraries, (206) 543-8409


GordonDunsire commented 3 weeks ago

@cspayne, @AdamSchiff: So it looks like a systematic import/matching error: The GND data should have been concatenated and the qualifiers put in parentheses before importing to 651 subfield $a. I assume the transform can rectify this by detecting the source as GND and concatenating subfield $a and subfield $g (with added parentheses to conform to authority norms).
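
A rough sketch of that rectification, using Python's standard `xml.etree.ElementTree` rather than the project's own transform code: the helper names are hypothetical, the sample is the East End example from this thread with a namespace declaration added so it parses on its own, and wrapping each $g in its own parentheses is an assumption about how repeated $g values should be handled.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

SAMPLE = """<marc:datafield xmlns:marc="http://www.loc.gov/MARC21/slim" tag="651" ind1=" " ind2="7">
    <marc:subfield code="a">East End</marc:subfield>
    <marc:subfield code="g">London</marc:subfield>
    <marc:subfield code="2">gnd</marc:subfield>
</marc:datafield>"""


def subfield_values(datafield, code):
    """Return the stripped, non-empty values of all subfields with this code."""
    return [
        sf.text.strip()
        for sf in datafield.findall("marc:subfield", MARC_NS)
        if sf.get("code") == code and sf.text and sf.text.strip()
    ]


def place_heading(datafield):
    """Build the heading: $a alone, or $a plus parenthesized $g when $2 is gnd."""
    names = subfield_values(datafield, "a")
    if not names:
        return None
    heading = names[0]
    qualifiers = subfield_values(datafield, "g")
    if "gnd" in subfield_values(datafield, "2") and qualifiers:
        # Assumption: each $g becomes its own parenthetical qualifier.
        heading += "".join(f" ({q})" for q in qualifiers)
    return heading


print(place_heading(ET.fromstring(SAMPLE)))  # East End (London)
```

For any source other than gnd, the sketch falls back to $a alone, which matches the earlier caution about treating $g as a qualifier in general.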

cspayne commented 2 weeks ago

Gordon wrote: "Don't know. We should be consistent and mint a nomen if there is source so we can preserve the source information."

I also think we should mint a nomen if there is a source. As we discussed yesterday, we cannot determine whether an access point is authorized based on the $1 value. So if there is an approved source with $1, shouldn't we mint a nomen for that authorized access point to retain the source information? If there is no approved source and a $1, that is when we do not mint a nomen.
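
The rule proposed here can be written down as a small decision function. This is a sketch of the proposal only, not of whatever was finally agreed: the names are hypothetical, and the contents of `APPROVED_SOURCES` are placeholders because the project's actual list of approved sources is not given in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder values; the real set of approved sources is an assumption here.
APPROVED_SOURCES = frozenset({"gnd", "fast"})


@dataclass(frozen=True)
class AccessPointPlan:
    mint_nomen: bool   # mint a nomen so the source information is retained
    authorized: bool   # output as AAP (True) or as a plain AP string (False)


def plan_access_point(source_code: Optional[str]) -> AccessPointPlan:
    """Decide from $2 alone; the presence of $1 does not make a value authorized."""
    approved = (
        source_code is not None
        and source_code.strip().lower() in APPROVED_SOURCES
    )
    return AccessPointPlan(mint_nomen=approved, authorized=approved)


print(plan_access_point("gnd"))
print(plan_access_point(None))
```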

GordonDunsire commented 1 week ago

@cspayne: I think we resolved these cases via our slides?

cspayne commented 1 week ago

@GordonDunsire yes! We did :)