lobid / lodmill

This repo is replaced by i.a. https://github.com/hbz/lobid-resources/

Add links to work entities for RDA records #784

Closed acka47 closed 6 years ago

acka47 commented 8 years ago

Sub-issue of https://github.com/hbz/lobid/issues/161.

See https://wiki.dnb.de/display/DINIAGKIM/Modellierung+Werk. First, I will have to look for example entries in the hbz01 data. Then we will have to implement this.

@dr0i Can you please create a new file with the RDA entries from hbz01? It isn't urgent, though...

acka47 commented 8 years ago

@dr0i Can you please create a new file with the RDA entries from hbz01? It isn't urgent, though...

I mean an update of the file you provided in https://github.com/hbz/lobid/issues/161#issuecomment-153293572.

dr0i commented 8 years ago

Created an updated list (see https://github.com/hbz/lobid/issues/161#issuecomment-173938952).

acka47 commented 8 years ago

As far as I can see, the data on the work that is manifested is in field 303.

Most of the work titles don't seem to have a link to GND, e.g. HT018848722:

          <datafield tag="303" ind1="-" ind2="1">
            <subfield code="p">Frank, Dietmar</subfield>
            <subfield code="t">Traumland Nepal &lt;eng&gt;</subfield>
          </datafield>

HT018847832 is an example of a work title with a GND URI. Source (snippet):

          <datafield tag="303" ind1="-" ind2="1">
            <subfield code="p">Dharmasenagaṅi Mahattara</subfield>
            <subfield code="d">ca. 7. Jh.</subfield>
            <subfield code="t">Vasudevahiṃdī-majjhima-khaṃda</subfield>
            <subfield code="9">(DE-588)4792592-9</subfield>
          </datafield>
acka47 commented 8 years ago

Further examples:

          <datafield tag="303" ind1="-" ind2="1">
            <subfield code="t">&lt;&lt;The&gt;&gt; killers</subfield>
            <subfield code="h">Film</subfield>
            <subfield code="f">1946</subfield>
          </datafield>
acka47 commented 8 years ago

HT018853619 is also interesting, using subfields u and z:

          <datafield tag="303" ind1="-" ind2="1">
            <subfield code="p">Mahler, Gustav</subfield>
            <subfield code="d">1860-1911</subfield>
            <subfield code="t">Lieder</subfield>
            <subfield code="m">Singst.</subfield>
            <subfield code="m">Orch</subfield>
            <subfield code="f">1905</subfield>
            <subfield code="u">Ich bin der Welt abhanden gekommen</subfield>
            <subfield code="9">(DE-588)300097921</subfield>
            <subfield code="z">Arr.</subfield>
          </datafield>
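
To compare such 303 fields across records, it can help to read the subfields into a simple structure first. The following is a minimal sketch, assuming the snippets are available as standalone XML strings without namespaces (the real hbz01 export may differ); `read_subfields` is a hypothetical helper, not part of lodmill.

```python
import xml.etree.ElementTree as ET

# Snippet copied from the HT018853619 example above.
MAB_XML = """
<datafield tag="303" ind1="-" ind2="1">
  <subfield code="p">Mahler, Gustav</subfield>
  <subfield code="d">1860-1911</subfield>
  <subfield code="t">Lieder</subfield>
  <subfield code="m">Singst.</subfield>
  <subfield code="m">Orch</subfield>
  <subfield code="f">1905</subfield>
  <subfield code="u">Ich bin der Welt abhanden gekommen</subfield>
  <subfield code="9">(DE-588)300097921</subfield>
  <subfield code="z">Arr.</subfield>
</datafield>
"""


def read_subfields(datafield_xml):
    """Return {subfield code: [values]} for one datafield (hypothetical helper)."""
    field = ET.fromstring(datafield_xml.strip())
    subfields = {}
    for sub in field.findall("subfield"):
        # Keep a list per code because codes such as "m" can repeat.
        subfields.setdefault(sub.get("code"), []).append(sub.text or "")
    return subfields


work = read_subfields(MAB_XML)
print(work["t"], work["m"], work.get("9"))
```

Keeping a list per subfield code preserves repeated subfields such as the two `m` entries in the Mahler example.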
acka47 commented 8 years ago

I got the cataloging help for field 303 from our colleague Stephanie:

      303       PREFERRED TITLE OF THE WORK (REPEATABLE, OPTIONAL, ALL RECORD TYPES)  -  RDA

               For field 303 with a blank indicator, subfields $p and $d are generated automatically from field 100 with a blank indicator.
               If field 200 with a blank indicator is present, the corresponding subfields for corporate bodies are generated in field 303 with a blank indicator.

             Section of the rules in the RDA Toolkit on the work title
             Section of the rules in the RDA Toolkit on the authorized access point

             Indicator:
             blank = Preferred title of the work (non-repeatable, optional)
             t  = Works embodied in the manifestation / related works (repeatable, optional)

             Subfields:

             Subfields for persons:
             p  = Person, preferred name (structured, GND) (non-repeatable, optional)
             n  = Numbering (non-repeatable, optional)
             c  = Epithet, generic name, territory, title (non-repeatable, optional)
             d  = Dates of life / activity (non-repeatable, optional)

             Subfields for corporate bodies:
             k  = Corporate body, preferred name (structured, GND) (non-repeatable, optional)
             b  = Subordinate body (repeatable, optional)
             x  = Multi-part designation (repeatable, optional)
             h  = Addition (Zusatz) (repeatable, optional)

             Subfields for events:
             e  = Event, preferred name (structured, GND) (non-repeatable, optional)
             h  = Event, addition (repeatable, optional)
             n  = Event, numbering (repeatable, optional)
             d  = Event, date (non-repeatable, optional)
             c  = Event, place (non-repeatable, optional)

             Subfields for geographic entities:
             g  = Geographic entity, preferred name (structured, GND) (non-repeatable, optional)
             b  = Subordinate unit (repeatable, optional)
             x  = Multi-part designation (repeatable, optional)
             h  = Addition (Zusatz) (repeatable, optional)

             Subfields for work titles:
             t  = Title (non-repeatable, mandatory) [selection list Ctrl+F8]
             h  = Addition (Zusatz) (repeatable, optional)
             m  = Instrumentation (Besetzung) (repeatable, optional)
             n  = Numbering (repeatable, optional)
             o  = Statement of a musical arrangement (non-repeatable, optional)
             u  = Title of a part/section of a work (repeatable, optional)
             r  = Key (repeatable, optional)
             s  = Version (repeatable, optional)
             x  = Multi-part designation (repeatable, optional)
             v  = Notes (repeatable, optional)
             z  = Designations, partial edition, genre (repeatable, optional)
             f  = Year of publication of a work (non-repeatable, optional)

             General subfields:
             9  = GND identification number (non-repeatable, optional)
             Z  = Assignment to the original-script field (non-repeatable, optional)

             a  = Name (unstructured) (non-repeatable, optional)  --- not actively recorded
             E  = Event, subordination (repeatable, optional)  --- not actively recorded
             i  = Relationship term (repeatable, optional)  --- not actively recorded
             H  = Medium (repeatable, optional)  --- not actively recorded
             U  = Affiliation (non-repeatable, optional)  --- not actively recorded
             X  = ISSN (non-repeatable, optional)  --- not actively recorded
             3  = Specific material designation (non-repeatable, optional)  --- not actively recorded
             4  = Relationship code (repeatable, optional)  --- not actively recorded
             8  = Field link and sequence (repeatable, optional)  --- not actively recorded

             ----- do not use in RDA records: -----

             Indicator:
             b     = Work title - accompanying works
             e     = Work title - contained works
             n     = Work title - unspecified

             Examples:

             Work embodied in the manifestation, with GND link
             303t  $p Kafka, Franz  $d 1883-1942  $t <<Der>> Prozess  $9 (DE-588)4099250-0

             Work by an American writer embodied in the manifestation, with GND link
             303t  $p Steinbeck, John  $d 1902-1968  $t  <<The>> winter of our discontent  $9 (DE-588)4791133-5

             Work embodied in the manifestation, without GND link
             303t  $p Dirschedl, Uta  $t <<Die>> griechischen Säulenbasen

             Compilation without a collective title (one creator)
             100   $p Clark, Mary Higgins  $d 1929-  $9 (DE-588)119451832
             303t  $p Clark, Mary Higgins  $d 1929-  $t I'll be seeing you  $9 (DE-588)…
             303t  $p Clark, Mary Higgins  $d 1929-  $t All around the town  $9 (DE-588)…
             331   $a <<Das>> fremde Gesicht

             Compilation without a collective title (several creators)
             100b  $p Geisler, Dagmar  $d 1958-  $4 aut  $9 (DE-588)115558195
             104b  $p Arold, Marliese  $d 1958-  $4 aut  $9 (DE-588)112060234
             303t  $p Geisler, Dagmar  $d 1958-  $t Kleine Hexengeschichten   $9 (DE-588)…
             303t  $p Arold, Marliese  $d 1958-  $t Kleine Vampirgeschichten  $9 (DE-588)…
             331   $a Kleine Hexengeschichten

             Musical work
             303_  $p Bach, Johann Sebastian  $d 1685-1750  $t Sonaten  $m Fl  $m Bc  $n BMV 1033 - 1035  $u Sonate  $n BMV 1035
                        $9 (DE-588)300010907  $z Arr.

             Continuing resources with a distinguishing characteristic
             303   $t Report  $h Society for Human Resource (New York, NY)
             303   $t Report  $h Society for Human Resource (Los Angeles, Calif.)
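
The compact notation used in these examples (tag, indicator, then `$`-prefixed subfields) can be split mechanically, which makes it easier to turn the examples into test data. This is a rough sketch under some assumptions: `_` and an empty indicator both mean "blank", subfield values never contain `$`, and the `<<...>>` markers only wrap a non-sorting part such as a leading article. `parse_compact_field` and `strip_nonsort` are hypothetical helpers.

```python
import re


def parse_compact_field(line):
    """Split e.g. '303t  $p Kafka, Franz  $t ...' into tag, indicator and subfields."""
    head, *parts = line.strip().split("$")
    head = head.strip()
    tag = head[:3]
    indicator = head[3:].strip()
    if indicator in ("", "_"):
        # Assumption: both notations stand for a blank indicator.
        indicator = " "
    subfields = []
    for part in parts:
        part = part.strip()
        if part:
            # First character is the subfield code, the rest is the value.
            subfields.append((part[0], part[1:].strip()))
    return tag, indicator, subfields


def strip_nonsort(title):
    """Drop the << >> markers but keep the enclosed text."""
    return re.sub(r"<<(.*?)>>", r"\1", title)


tag, indicator, subfields = parse_compact_field(
    "303t  $p Kafka, Franz  $d 1883-1942  $t <<Der>> Prozess  $9 (DE-588)4099250-0"
)
title = strip_nonsort(dict(subfields)["t"])
print(tag, indicator, title)
```

Subfields are kept as an ordered list of pairs rather than a dictionary so that repeated codes such as `$m` or `$n` in the musical work example are not lost.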
ChristophEwertowski commented 7 years ago

I transformed most of the examples to Turtle (I omitted the "Zupforchester" example because it is redundant with the one before it). Where possible I used the properties from context.json. It was especially difficult to express that a musical work needs a specific "Besetzung" (instrumentation), see the last example.


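# The bf: prefix was not declared in the original; the BIBFRAME 2.0 namespace is assumed here.
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix bibo: <http://purl.org/ontology/bibo/> .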
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix gnd: <http://d-nb.info/standards/elementset/gnd#> .
@prefix lv: <http://purl.org/lobid/lv#> .
@prefix mo: <http://purl.org/ontology/mo/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://lobid.org/resources/HT018848722> bf:contribution _:y .
_:y bf:agent _:z .
_:z rdfs:label "Frank, Dietmar" ;
  rdf:type gnd:Person .
_:y bf:role <http://id.loc.gov/vocabulary/relators/cre> .
<http://id.loc.gov/vocabulary/relators/cre> rdfs:label "Autor/in" . 
_:y rdf:type bf:Contribution .
<http://lobid.org/resources/HT018848722> dct:alternative "Traumland Nepal" ;  
  dct:language "eng" . 

<http://lobid.org/resources/HT018847832> bf:contribution _:y .
_:y bf:agent <http://d-nb.info/gnd/12018432X> .
<http://d-nb.info/gnd/12018432X> skos:altLabel "Mahattara, Dharmasenagaṇi",
    "Dharmaseṇa Gaṇi, Mahattara" ;
  rdfs:label "Dharmasenagaṅi Mahattara" ;
  rdf:type gnd:Person .
_:y bf:role <http://id.loc.gov/vocabulary/relators/cre> .
<http://id.loc.gov/vocabulary/relators/cre> rdfs:label "Autor/in" .
<http://lobid.org/resources/HT018847832> dct:title "Vasudevahiṃdī-madhyama-khaṇḍa" ;
  dct:description "ca. 7. Jh." ;
  lv:contributorLabel "Ramanik Shah", "Saha, Ra. Ma.", "Śāha, Mayaṅka Āra.", 
    "Shah, R. M.", "Śāha, Ramaṇīka", "R. Mayaṅka Śāha", "Śāha, Ra. Ma.", 
    "Shah, Ramanik", "Mahattara, Dharmasenagaṇi", "Dharmaseṇa Gaṇi", 
    "Śāha, Ramaṇīka Ma.", "Mayaṅka Āra. Śāha", "Dharmasenagaṅi Mahattara", 
    "Ramaṇīka Ma. Śāha", "Śāha, R. Mayaṅka", "Shah, Ramanik M.";
  <http://www.rdaregistry.info/Elements/u/P60339> "Śrī Dharmasena-gaṇi-mahattara viracita prākṛta-bhāṣābaddha ; saṃpādaka Ḍā. Ramaṇīka Śāha" ;
  rdfs:seeAlso <http://d-nb.info/gnd/4792592-9> .

# Same problem as above: the exact same title appears in 303 and 331.
# If the strings differ, add the 303 title as an alternativeTitle.

<http://lobid.org/resources/HT018848736> bf:contribution _:y .
_:y bf:agent <http://d-nb.info/gnd/120192691> .
<http://d-nb.info/gnd/120192691> skos:altLabel "Yaśodevasūrī",
    "Yashovijayaji, Upadhyayaji", "Yaśovijaya, Gaṇi", "Yaśovijaya Sūri", 
    "Yashovijayopadhyaya", "Yaśovijaya, Sūri", "Yaśovijayagaṇi", 
    "Yashovijay Upadhyay", "Yaśovijayaj", "Yaśovijayasūri",
    "Yaśovijaya Upādhyāya", "Yaśovijayopādhyāya", "Yaśovijayajī, Mahopādhyāya" ;
    rdfs:label "Yaśovijaya" ;
    rdf:type gnd:Person .
_:y bf:agent <http://d-nb.info/gnd/1080816232> .
# ...
<http://lobid.org/resources/HT018848736> dct:title "Dravya-guṇa-paryāya kā rāsa" .

#Alternative names. Doesn't contain multiple works.

<http://lobid.org/resources/HT018852614> bf:contribution _:y .
_:y bf:agent <http://d-nb.info/gnd/115364005> .
<http://d-nb.info/gnd/115364005> skos:altLabel "Hog, Peṭer",
    "Hoeg, Peter", "הוג, פטר","Höeg, Peter" ;
    rdfs:label "Høeg, Peter" ;
    rdf:type gnd:Person .
_:y bf:role <http://id.loc.gov/vocabulary/relators/cre> .
<http://id.loc.gov/vocabulary/relators/cre> rdfs:label "Autor/in" .
_:y rdf:type bf:Contribution .
<http://lobid.org/resources/HT018852614> dct:title "Smilla et l'amour de la neige" ;
  dct:language <http://id.loc.gov/vocabulary/iso639-2/fra> .
<http://id.loc.gov/vocabulary/iso639-2/fra> rdfs:label "Französisch" .

<http://lobid.org/resource/HT018853244> lv:volumeIn _:y .
_:y lv:multiVolumeWork <http://lobid.org/resources/HT001342599> ;
  lv:numbering "34,35" ;
  rdf:type lv:MultiVolumeWorkRelation .

  <http://lobid.org/resources/HT018852641> dct:title "36 caprices in all major and minor keys" ;
    dct:alternative "Capriccios" ;
    mo:instrument <http://d-nb.info/gnd/4124933-1> ;
    mo:opus "op. 20" ; # or lv:numbering "op. 20"
    <http://www.w3.org/2002/07/owl#sameAs> <http://d-nb.info/gnd/300090544> ;
    bf:contribution _:y .
  _:y bf:agent <http://d-nb.info/gnd/116861002> .
  <http://d-nb.info/gnd/116861002> skos:altLabel "Legnani, Luigi R." ,
    "Legnani, Luigi Rinaldo" ;
    rdfs:label "Legnani, Luigi" ;
    rdf:type gnd:Person ;
    gnd:dateOfBirth "1790" ; # Lebensdaten and Wirkdaten are in one subfield!
    gnd:dateOfDeath "1877" . # Lebensdaten and Wirkdaten are in one subfield!

<http://lobid.org/resources/HT018856101> dct:alternative "<<The>> killers" ; # *the* is added as marked by < and > in source
  dct:medium "Film" ; #in field 'Zusatz' maybe use description?
  dct:created "1946" .

<http://lobid.org/resources/HT018853619> dct:title "Lieder" ;
  dct:created "1905" ;
  dct:hasPart _:x . # dct documentation: 'term is intended to be used with non-literal values'
_:x dct:title "Ich bin der Welt abhanden gekommen" ;
  rdfs:seeAlso <http://d-nb.info/gnd/300097921> ;
  bf:contribution _:y .
_:y bf:agent <http://d-nb.info/gnd/118576291> ;
  bf:agent _:a ;
  bf:agent _:b .
_:a rdf:type gnd:Person ;
  mo:primary_instrument <http://d-nb.info/gnd/4156941-6> .
_:b mo:primary_instrument <http://d-nb.info/gnd/4172708-3> .
<http://d-nb.info/gnd/118576291> gnd:preferredNameForThePerson "Mahler, Gustav" ;
  gnd:dateOfBirth "1860" ; # Lebensdaten and Wirkdaten are in one subfield!
  gnd:dateOfDeath "1911" . # Lebensdaten and Wirkdaten are in one subfield!
acka47 commented 7 years ago

Going through the examples, I realize that I obviously haven't been clear about what we are trying to achieve with this issue. The central point is that both we and the Titeldaten working group want to replace the current approach (adding the Einheitssachtitel as dct:alternative) with one that allows attaching more information (such as the URI of the work) to the described work. At https://wiki.dnb.de/display/DINIAGKIM/Modellierung+Werk we collected possible approaches. What we have to do now is choose an approach for lobid that covers the different use cases. I think what is most important to retain in the RDF is the actual title string and, if it exists, the work ID.

acka47 commented 7 years ago

Here is what the first two examples could look like:

{
   "@context":{
      "id":"@id",
      "title":"http://purl.org/dc/terms/title",
      "exampleOfWork":"http://schema.org/exampleOfWork"
   },
   "id":"http://lobid.org/resources/HT018848722#!",
   "exampleOfWork":{
      "title":"Traumland Nepal <eng>"
   }
}
{
   "@context":{
      "id":"@id",
      "title":"http://purl.org/dc/terms/title",
      "exampleOfWork":"http://schema.org/exampleOfWork"
   },
   "id":"http://lobid.org/resources/HT018847832#!",
   "exampleOfWork":{
      "id":"http://d-nb.info/gnd/4792592-9",
      "title":"Vasudevahiṃdī-majjhima-khaṃda"
   }
}
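
The step from a 303 field to such an `exampleOfWork` object could be sketched as follows. This is only an illustration, not the actual lobid transformation (which is done in a morph file): `work_from_303` and `resource_with_work` are hypothetical helpers, and the sketch assumes that `$9` always has the form `(DE-588)<number>` when it points to the GND.

```python
import json
import re

GND_BASE = "http://d-nb.info/gnd/"  # URI prefix used in the examples in this thread

CONTEXT = {
    "id": "@id",
    "title": "http://purl.org/dc/terms/title",
    "exampleOfWork": "http://schema.org/exampleOfWork",
}


def work_from_303(subfields):
    """Build the work object from {code: value} of one 303 field (hypothetical helper)."""
    work = {"title": subfields["t"]}
    # Only add an id when $9 carries a GND number, e.g. "(DE-588)4792592-9".
    match = re.fullmatch(r"\(DE-588\)(\S+)", subfields.get("9", ""))
    if match:
        work["id"] = GND_BASE + match.group(1)
    return work


def resource_with_work(hbz_id, subfields):
    """Wrap the work object in a resource description like the ones above."""
    return {
        "@context": CONTEXT,
        "id": "http://lobid.org/resources/" + hbz_id + "#!",
        "exampleOfWork": work_from_303(subfields),
    }


without_gnd = resource_with_work("HT018848722", {"p": "Frank, Dietmar", "t": "Traumland Nepal <eng>"})
with_gnd = resource_with_work(
    "HT018847832",
    {"t": "Vasudevahiṃdī-majjhima-khaṃda", "9": "(DE-588)4792592-9"},
)
print(json.dumps(with_gnd, ensure_ascii=False, indent=3))
```

Works without a GND link simply get no `id`, which in JSON-LD makes them blank nodes that still carry the title string.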
acka47 commented 7 years ago

There are some questions to be considered:

ChristophEwertowski commented 7 years ago

  1. Title and id probably aren't enough. I'm thinking of "Jahresberichte" (annual reports), which are often grey literature and thus in some cases won't reach the DNB (because there is no legal deposit obligation, Pflichtexemplarrecht). Then it's up to the catalogers in the libraries to use the right authority data entry for the author or issuing institution, e.g. creator.
ChristophEwertowski commented 7 years ago

Also I don't know how many works are without an id of some sort.

  1. The question is if someone is or would be searching for the instrumentation in the hbz catalog. Maybe ask a music librarian.
ChristophEwertowski commented 7 years ago

For many scores (Partituren) the instrument is already named in the title (see search). If not, there are multiple subjects that could fit, but it's cumbersome to select the fitting one.

Looked for "Sopran" as a subject keyword in the KVK. It is handled differently depending on which catalog you choose:

- DNB: In subject field (probably a line for each subject order).
- BVB: In subject field. At "Mehr zum Titel" of the example you can see that if they have two subject chains, they put them together and sort them alphabetically.
- GBV: In subject field (order not discernible). Has a "Besetzung" field (example: PPN 870811207).
- HBZ: You get hits but no clue where the subject comes from (Aleph data which you can't access in the old catalog).
- HEBIS: In subject field (order not discernible). Also has its own "Besetzung" field (for example 978-1-4950-5821-9).
- KOBV: In subject field (order not discernible).
- SWB: DDC number for Sopran, but no label is shown.

At the moment you don't really know where to look for instrumentation. It's in many fields and sometimes only in the Aleph data. Some resources have the MAB field filled out, some don't. I would be in favor of including instrumentation in the resources. But simply being able to search by URL would also be a great improvement. I will get in touch with some colleagues of mine and ask them whether they ever searched for instrumentation, or what they searched for in the relevant course at the university.

acka47 commented 7 years ago

When implementing something we will also have to adjust the transformation of field 304 "Einheitssachtitel" which is currently mapped to dct:alternative, see current morph line 546.

ChristophEwertowski commented 7 years ago

Some of this is handled in https://github.com/hbz/lobid-resources/issues/516.

dr0i commented 6 years ago

As hbz/lobid-resources#567 is resolved this issue is ready to be worked on. Pinging @ChristophEwertowski and @acka47 .

ChristophEwertowski commented 6 years ago

Work entities with GND-id are covered by https://github.com/hbz/lobid-resources/issues/567. But there are some which don't have an id like: HT018848722 (see MAB 303).

ChristophEwertowski commented 6 years ago

Changes deployed to test. See HT018848722 (production, json) / HT018848722 (test, json) for the original example (person without life dates, plus title of work), HT018848736 (production, json) / HT018848736 (test, json) (person with life dates) and HT018853619 (production, json) / HT018853619 (test, json) (title with subtitle, occupation and person with life dates; the same as in GND).

acka47 commented 6 years ago

Very good. +1