hbz / lobid

Linking Open Bibliographic Data
https://lobid.org/
Eclipse Public License 2.0

Incorrect order of occupations #419

Closed j4lib closed 9 months ago

j4lib commented 4 years ago

Hi lobid team,

we just noticed that the order in which the occupations appear in the JSON-LD does not correspond to the order in MARC XML or RDF/XML. There also seems to be a difference between 'Beruf charakteristisch=berc' (characteristic occupation) and everything else ('beru'), if I understood correctly. Example: https://lobid.org/gnd/1043940596 http://d-nb.info/gnd/1043940596

I mention the example above because that person complained very angrily that we list him as a director first rather than as an author.

acka47 commented 4 years ago

There is no order in the RDF provided by DNB, see http://d-nb.info/gnd/1043940596/about/lds:

<https://d-nb.info/gnd/1043940596> gndo:professionOrOccupation 
<https://d-nb.info/gnd/4003982-1>, <https://d-nb.info/gnd/4140241-8>,
<https://d-nb.info/gnd/4052154-0>, <https://d-nb.info/gnd/4049050-6>,
 <https://d-nb.info/gnd/4294338-3> .

Thus, the occupations come out in arbitrary order after passing through the transformation process with RDF libraries. I will open an issue in the GND Jira.

acka47 commented 4 years ago

Here is the corresponding issue (login necessary): https://jira.dnb.de/browse/GND-144

acka47 commented 11 months ago

Update from the JIRA issue:

The change has been put into production here as well. See https://d-nb.info/gnd/1043940596/about/lds. Dumps will follow at the end of October/beginning of November.

Here is the Turtle, which uses rdf:Seq:

<https://d-nb.info/gnd/1043940596> gndo:professionOrOccupation _:node1hdlfd95jx83301688 .

_:node1hdlfd95jx83301688 a rdf:Seq;
  rdf:_1 <https://d-nb.info/gnd/4003982-1>;
  rdf:_2 <https://d-nb.info/gnd/4140241-8>;
  rdf:_3 <https://d-nb.info/gnd/4052154-0>;
  rdf:_4 <https://d-nb.info/gnd/4049050-6>;
  rdf:_5 <https://d-nb.info/gnd/4294338-3> .

We will have to check whether, and if so which, modifications need to be made in lobid.
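To illustrate what reading this structure involves, here is a minimal sketch in plain Python (standard library only, no RDF library). It assumes the triples are already available as `(subject, predicate, object)` string tuples; the function name `seq_members` and that tuple representation are hypothetical and not part of lobid. The point is that the order has to be recovered from the number in `rdf:_n`, not from the order in which the triples arrive.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# rdf:_1, rdf:_2, ... are the membership properties of an rdf:Seq
MEMBER = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)$")


def seq_members(triples, seq_node):
    """Return the members of the rdf:Seq `seq_node` in index order (hypothetical helper)."""
    indexed = []
    for s, p, o in triples:
        if s != seq_node:
            continue
        m = MEMBER.match(p)
        if m:
            indexed.append((int(m.group(1)), o))
    # Sort numerically so that rdf:_10 comes after rdf:_9, not after rdf:_1
    return [o for _, o in sorted(indexed)]


GND = "https://d-nb.info/gnd/"
node = "_:node1hdlfd95jx83301688"
triples = [
    # Deliberately shuffled: the order of triples carries no meaning in RDF
    (node, RDF_NS + "_3", GND + "4052154-0"),
    (node, RDF_NS + "_1", GND + "4003982-1"),
    (node, RDF_NS + "type", RDF_NS + "Seq"),
    (node, RDF_NS + "_5", GND + "4294338-3"),
    (node, RDF_NS + "_2", GND + "4140241-8"),
    (node, RDF_NS + "_4", GND + "4049050-6"),
]

occupations = seq_members(triples, node)
```

Sorting on the integer index rather than on the property IRI matters as soon as a record has ten or more entries.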

acka47 commented 11 months ago

With updates coming in that include the RDF list, we already get alert mails "Alert GND: found not compacted field(s)!":

gnd_20221212.mappings.authority.properties.professionOrOccupation.properties.http://www.properties.w3.properties.org/1999/02/22-rdf-syntax-ns#_1.properties.id.type
gnd_20221212.mappings.authority.properties.professionOrOccupation.properties.http://www.properties.w3.properties.org/1999/02/22-rdf-syntax-ns#_1.properties.label.type
gnd_20221212.mappings.authority.properties.professionOrOccupation.properties.http://www.properties.w3.properties.org/1999/02/22-rdf-syntax-ns#_1.properties.label.fields.keyword.type
gnd_20221212.mappings.authority.properties.professionOrOccupation.properties.http://www.properties.w3.properties.org/1999/02/22-rdf-syntax-ns#_10.properties.id.type
...

This might be solved by updating the context for professionOrOccupation with "@container": "@list". I am not sure it will work, though, as it probably depends on how the JSON-LD library we are using interprets the rdf:Seq list approach DNB chose (see this Stack Overflow question: https://stackoverflow.com/questions/44959817/how-to-represent-collection-of-alternatives-in-json-ld). We might just try it. I opened https://github.com/hbz/lobid-gnd/pull/358 for this.
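For background on why this is uncertain: in JSON-LD, a term with "@container": "@list" corresponds to an RDF collection, i.e. a chain of rdf:first/rdf:rest cells ending in rdf:nil, which is a different structure from an rdf:Seq with rdf:_1, rdf:_2, ... properties. The following plain-Python sketch only shows what such a collection looks like as triples; the function name `rdf_list_triples` and the blank node labels are hypothetical, and nothing here reflects how lobid or its JSON-LD library is implemented.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_list_triples(items, prefix="_:l"):
    """Encode `items` as an RDF collection; return (head node, triples)."""
    if not items:
        # The empty list is represented by rdf:nil alone
        return RDF_NS + "nil", []
    triples = []
    for i, item in enumerate(items):
        cell = f"{prefix}{i}"
        # The last cell points to rdf:nil, all others to the next cell
        rest = f"{prefix}{i + 1}" if i + 1 < len(items) else RDF_NS + "nil"
        triples.append((cell, RDF_NS + "first", item))
        triples.append((cell, RDF_NS + "rest", rest))
    return f"{prefix}0", triples


GND = "https://d-nb.info/gnd/"
head, triples = rdf_list_triples([
    GND + "4003982-1",
    GND + "4140241-8",
    GND + "4052154-0",
    GND + "4049050-6",
    GND + "4294338-3",
])
```

None of the resulting predicates is an rdf:_n property, so a processor that only maps "@list" to this shape would have no reason to treat the rdf:Seq node from DNB as a list.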

acka47 commented 11 months ago

We have changed @container to @list, but unfortunately this does not seem to do the trick, see https://test.lobid.org/gnd/search?q=describedBy.dateModified%3A2023-11-02

It looks like we have to dive deeper into RDF lists (which exist in different forms, see https://stackoverflow.com/questions/44959817/how-to-represent-collection-of-alternatives-in-json-ld). GND RDF uses rdf:Seq, see https://github.com/hbz/lobid/issues/419#issuecomment-1757821937
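One conceivable direction, sketched here in plain Python purely for illustration, is to fold the rdf:_n keys into an ordered array after the JSON has been produced. The record shape below is an assumption inferred from the field paths in the alert mail (professionOrOccupation, then the full rdf:_n IRI, then id/label); the helper name `seq_node_to_list` is hypothetical, and this is not necessarily the approach lobid ends up taking.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MEMBER_KEY = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)$")


def seq_node_to_list(node):
    """Turn {rdf:_1: a, rdf:_2: b, ...} (full IRIs as keys) into [a, b, ...]."""
    indexed = []
    for key, value in node.items():
        m = MEMBER_KEY.match(key)
        if m:
            indexed.append((int(m.group(1)), value))
    # Sort on the numeric index only; the values are dicts and not comparable
    indexed.sort(key=lambda pair: pair[0])
    return [value for _, value in indexed]


# Assumed shape of an uncompacted record (labels omitted on purpose)
record = {
    "id": "https://d-nb.info/gnd/1043940596",
    "professionOrOccupation": {
        RDF_NS + "_2": {"id": "https://d-nb.info/gnd/4140241-8"},
        RDF_NS + "_1": {"id": "https://d-nb.info/gnd/4003982-1"},
    },
}
record["professionOrOccupation"] = seq_node_to_list(record["professionOrOccupation"])
```

After this step the field is a plain JSON array in sequence order, which would also avoid the uncompacted rdf:_n keys in the index mapping.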

acka47 commented 9 months ago

Fixed with indexing of new GND RDF dump, see #366. See https://lobid.org/gnd/1043940596. Closing.