hbz / lobid-resources

Transformation, web frontend, and API for the hbz catalog as LOD
http://lobid.org/resources
Eclipse Public License 2.0

Remodel contributor roles (ALMA) #1248

Closed TobiasNx closed 3 years ago

TobiasNx commented 3 years ago

Concerning our ALMA transformation: at the moment, organisations as contributors always get the role "Mitwirkende" (contributor), but "Autor" (author) should be possible too.

https://github.com/hbz/lobid-resources/blob/2d136786157b1a3f4b6837f980a4a5b33d9cc576/src/main/resources/alma/common/contribution.xml#L113-L120

TobiasNx commented 3 years ago

To not lose the discussion from the unused PR #1255:

https://github.com/hbz/lobid-resources/pull/1255#issuecomment-833313750:

I'll try to answer your questions here:

Are the codes always MARC relators?

I am quite sure that they are.

How do we handle contributors with more than one role? See HT019246898. Perhaps we need an array.

See e.g. #933 (comment) that links to more sources. I'll repeat what I wrote in #868 (comment) here:

In 2016 I asked on the Bibframe list how to record multiple contributions by the same person. Ray Denenberg of LoC responded:

If you want to list two roles, e.g. “illustrator” and “author” for the same person, the rule is, declare two separate Contribution resources. Two or more role statements within the same Contribution resource should occur only because you want to provide two different representations (e.g. two different URIs) for the same role.

How do we provide labels, is there a mapping list?

The labels for MARC relators and many other controlled terms are configured in the labels.json.

and https://github.com/hbz/lobid-resources/pull/1255#issuecomment-833557636

Concerning the multiple roles: that seems difficult to model to me, since the roles appear in a single field (100/110/700/710) as multiple subfields, e.g. HT019246898:

  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Abel, Alexandra</subfield>
    <subfield code="0">(DE-588)1069204668</subfield>
    <subfield code="4">edt</subfield>
    <subfield code="4">aut</subfield>
    <subfield code="0">(uri) https://portal.dnb.de/opac.htm?method=simpleSearch&amp;cqlMode=true&amp;query=idn=1069204668</subfield>
    <subfield code="0">(uri) http://viaf.org/viaf/sourceID/DNB|1069204668</subfield>
    <subfield code="B">GND-1069204668</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Rudolf, Bernd</subfield>
    <subfield code="0">(DE-588)1038489679</subfield>
    <subfield code="4">edt</subfield>
    <subfield code="4">aut</subfield>
    <subfield code="0">(uri) https://portal.dnb.de/opac.htm?method=simpleSearch&amp;cqlMode=true&amp;query=idn=1038489679</subfield>
    <subfield code="0">(uri) http://viaf.org/viaf/sourceID/DNB|1038489679</subfield>
    <subfield code="B">GND-1038489679</subfield>
  </datafield>
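
A minimal sketch of how the rule quoted above (one Contribution resource per role) could be applied to such a field, with the repeated subfield 4 codes each producing their own contribution. This is not the Metafacture morph used in lobid-resources. The function name `contributions` and the output shape (`agent`, `role`, `label`, `id`) are assumptions for illustration only; the namespace is the Library of Congress MARC relators vocabulary.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first 700 field quoted above (HT019246898).
FIELD_700 = """
<datafield tag="700" ind1="1" ind2=" ">
  <subfield code="a">Abel, Alexandra</subfield>
  <subfield code="0">(DE-588)1069204668</subfield>
  <subfield code="4">edt</subfield>
  <subfield code="4">aut</subfield>
</datafield>
"""

# Library of Congress namespace for MARC relator codes.
MARCREL = "http://id.loc.gov/vocabulary/relators/"


def contributions(datafield_xml):
    """Return one contribution dict per relator code found in subfield 4.

    The dict layout is hypothetical and only illustrates the
    "one contribution per role" rule.
    """
    field = ET.fromstring(datafield_xml)
    name = field.findtext("subfield[@code='a']")
    codes = [sf.text for sf in field.findall("subfield[@code='4']")]
    return [
        {"agent": {"label": name}, "role": {"id": MARCREL + code}}
        for code in codes
    ]


result = contributions(FIELD_700)
for contribution in result:
    print(contribution)
```

With the field above, the same agent appears in two separate contributions, one for `edt` and one for `aut`, instead of a single contribution carrying an array of roles.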

and https://github.com/hbz/lobid-resources/pull/1255#issuecomment-833569272

Concerning the multiple roles: that seems difficult to model to me, since the roles appear in a single field (100/110/700/710) as multiple subfields:

Isn't this analogous to the Aleph XML data that we already process for lobid-resources, where different relator codes are also repeated in subfield 4? See e.g. HT018857620 (snippet):

<datafield tag="100" ind1="b" ind2="1">
  <subfield code="p">Barenboim, Daniel</subfield>
  <subfield code="d">1942-</subfield>
  <subfield code="4">cnd</subfield>
  <subfield code="3">Dirigent</subfield>
  <subfield code="4">drt</subfield>
  <subfield code="3">Regisseur</subfield>
  <subfield code="9">(DE-588)118506560</subfield>
</datafield>

Maybe a look into the hbz01-to-lobid morph makes sense. #881 is the PR in which the transformation of multiple roles for one agent was implemented. Here is the diff in the morph: https://github.com/hbz/lobid-resources/pull/881/files#diff-c79e89b206237040d700b790040652e320068ef3301084a46eeee16cb715d952

TobiasNx commented 3 years ago

@dr0i has introduced the etikett-maker and I have separated the marcRel labels from the other labels. I also added marcRel concepts that were not yet in use but are named by the DNB Standardisierungsausschuss in https://wiki.dnb.de/download/attachments/106042227/AH-017.pdf

@acka47 Is there a reason why we did not use marcrel:aut for authors but transformed all aut to cre (creator)?

Concerning the roles, we should have another look at the translated labels, since I used a gendered version of the DNB translations and not the previous lobid translations. Perhaps we want to "move" back.

We should have a look at 100/110/700/710 contributors that have no role. If I see correctly, the Aleph morph did not work with the relator codes but with labels that were transformed to codes.

The approach to multiple roles for one contributor needs to be figured out separately. We should open a new issue for that.

dr0i commented 3 years ago

I think I was assigned accidentally. Assigning @acka47.

acka47 commented 3 years ago

Is there a reason why we did not use marcrel:aut for authors but transformed all aut to cre (creator)?

This originates from https://github.com/hbz/lobid-resources/issues/72 (at which point we apparently hadn't decided yet to use bf:contribution), where I wrote:

Default entry (at least for person) is aut but for now we will continue using dct:creator.

So the reason might be that

  1. there was no property like author, so we stuck to dct:creator and modeled the data accordingly when moving to bf:contribution, or
  2. aut was used as the default in the data also for non-textual materials where it did not fit, so we stuck with creator.

Whatever the reason was, I don't see that we'd gain much by using aut instead of cre, as the distinction only concerns the form of the described resource, which we already indicate with type and medium. However, if you are more knowledgeable about the source data and see no harm in using aut, @TobiasNx, we could think about changing this.

TobiasNx commented 3 years ago

In the test data I only find "aut" with textual publications, but there are only 20 such records. To be sure, we could set up a test index and check whether "aut" really is a default value. As far as I know, there is no written statement about a default marcRel code.

On the other hand, since we use marcRel, we should reuse the source data of the Verbundskatalog. I would therefore vote for keeping the "original" code "aut" and not transforming it to "cre".

A more generic version of role types is also provided by the RDA -> marcRel mapping: https://wiki.dnb.de/download/attachments/106042227/AH-017.pdf But this could serve someone as a mapping.

But there are still three other questions that we should talk about next week:

Concerning the roles, we should have another look at the translated labels, since I used a gendered version of the DNB translations and not the previous lobid translations. Perhaps we want to "move" back.

We should have a look at 100/110/700/710 contributors that have no role. If I see correctly, the Aleph morph did not work with the relator codes but with labels that were transformed to codes.

The approach to multiple roles for one contributor needs to be figured out separately. We should open a new issue for that.

TobiasNx commented 3 years ago

We have a sample index of ~500000 (http://test.alma.lobid.org/resources/search?q=*&format=json). In this sample we have 191179 records with "aut" but only 88 that are not associated with book or periodical: http://test.alma.lobid.org/resources/search?q=contribution.role.label:%22aut%22%20AND%20NOT%20type:%22Book%22%20AND%20NOT%20type:%22Periodical%22%20&format=json
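
A small sketch of how the query above can be assembled from its parts, in case someone wants to vary the excluded types. The helper name `search_url` is hypothetical; the base URL, field names, and query syntax are taken from the link above. Note that `urlencode` also percent-encodes the colons, which is equivalent to the hand-written URL.

```python
from urllib.parse import urlencode, quote

# Test index mentioned above.
BASE = "http://test.alma.lobid.org/resources/search"


def search_url(query, fmt="json"):
    """Build a search URL for the given query string (hypothetical helper)."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    return BASE + "?" + urlencode({"q": query, "format": fmt}, quote_via=quote)


# Records with role "aut" that are neither Book nor Periodical.
non_textual_aut = (
    'contribution.role.label:"aut" '
    'AND NOT type:"Book" AND NOT type:"Periodical"'
)

url = search_url(non_textual_aut)
print(url)
```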

So it seems reasonable that we use "aut".

TobiasNx commented 3 years ago

We (@acka47 and I) decided offline that we keep the labels given by the Verbundskatalog, since the catalogue data does not seem to use "aut" for non-textual publications.

TobiasNx commented 3 years ago

Hi @acka47, could you check whether the labels are fine like this?

dr0i commented 3 years ago

Don't know what to do here. Unassigning myself.

dr0i commented 3 years ago

Deployed, see e.g. https://alma.lobid.org/resources/search?q=HT000161712&format=json. Closed.