hbz / lobid-gnd

UI and API to the Integrated Authority File (Gemeinsame Normdatei, GND)
http://lobid.org/gnd
Eclipse Public License 2.0

Problems with sameAs ISNI links when they are only in GND source #230

Closed. annainfo closed this issue 4 years ago

annainfo commented 4 years ago

https://lobid.org/gnd: "The data basis (Datenbasis) is the RDF version of the GND (updated daily) and EntityFacts (updated quarterly)."

But not all data available in the RDF is displayed. For example, https://lobid.org/gnd/1193757835 does not display the ISNI that is contained in the RDF. Archived copies: https://web.archive.org/web/20190916114811/https://lobid.org/gnd/1193757835 (shows a broken image symbol that does not appear on the source page) and https://web.archive.org/web/20190916114811im_/https://lobid.org/gnd/1193757835 (no broken image symbol and no archive.org navigation).

acka47 commented 4 years ago

Thanks for the report. The problem is not that data is missing, as the information from the GND RDF is there, in the JSON:

{
   "sameAs":[
      {
         "id":"http://isni.org/isni/0000000004345912",
         "collection":{
            "id":"http://isni.org"
         }
      }
   ]
}

However, it is not displayed correctly in the HTML view, because the link to the thumbnail etc. is not enriched in the JSON-LD. I think we do this for VIAF, Wikipedia and maybe other links but it does not seem to work for ISNI. I updated the title of this issue.
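To illustrate the distinction between "data present" and "data enriched", here is a minimal Python sketch that flags sameAs links whose collection carries only an id. The display field names ("name", "icon") and the helper name links_missing_details are assumptions made for this example, not confirmed lobid-gnd field names.

```python
import json

# The JSON excerpt quoted above: the ISNI link is present, but its
# collection carries only an "id" and nothing a UI could display.
record = json.loads("""
{
   "sameAs":[
      {
         "id":"http://isni.org/isni/0000000004345912",
         "collection":{
            "id":"http://isni.org"
         }
      }
   ]
}
""")

# Assumed display fields; the real enrichment may use other names.
DISPLAY_FIELDS = ("name", "icon")


def links_missing_details(rec):
    """Return the ids of sameAs links whose collection lacks display fields."""
    missing = []
    for link in rec.get("sameAs", []):
        collection = link.get("collection", {})
        if not all(field in collection for field in DISPLAY_FIELDS):
            missing.append(link["id"])
    return missing


print(links_missing_details(record))
```

A check like this separates the two cases discussed here: the link itself exists in the JSON, only its presentation details are absent.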

annainfo commented 4 years ago

"The problem is not that data is missing," Exactly. The reported problem was named /home page - misleading description - "Datenbasis"/ - it is missing and not specified. "as the information from the GND RDF is there, in the JSON:" - that does not make it visible in the HTML view.

"the link to the thumbnail etc. is not enriched in the JSON-LD" - maybe avoid requirement for "enriching"?

annainfo commented 4 years ago

RE new title of issue: "Problems with sameAs ISNI links when they are only in GND source" - what is meant by "only in GND source" - isn't that the basic feature, displaying information from the GND?

acka47 commented 4 years ago

what is meant by "only in GND source" - isn't that the basic feature, displaying information from the GND?

Not only the GND but also EntityFacts, as the description you quote reads:

The data is based on the RDF version of the GND (updated daily) and EntityFacts (updated quarterly).

So we have both less information than GND MARC21, because the RDF does not cover everything, and more, because of the added EntityFacts information.

Looking at our example entry 1193757835, we see that in the RDF from the GND (Turtle file), there is only one triple regarding ISNI:

<http://d-nb.info/gnd/1193757835> owl:sameAs <http://isni.org/isni/0000000004345912>

As we are processing the GND RDF and not the full MARC21 information and as the EntityFacts entry was not present in the last dump we processed, this is what we also have in lobid-gnd by now, so there is nothing missing at all and the description is accurate. We even have the link to ISNI in the HTML, but as it lacks a label, it is not really useful. What you want is an image and the name of the source, which you will not find in the MARC21 from GND at all but in EntityFacts (see here). We already add this for ORCID links that are only in the GND RDF but not in EntityFacts (see #187), and I adjusted this issue to also do this for ISNI.
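As a rough sketch of the enrichment described here, the following Python turns the single owl:sameAs triple into a sameAs entry and adds collection details from a small lookup table. The table KNOWN_COLLECTIONS, its field names ("name", "abbr"), and the function same_as_entry are hypothetical and only illustrate the idea; the actual lobid-gnd transformation may work differently.

```python
import re
from urllib.parse import urlsplit

# The single ISNI-related triple from the GND RDF, as quoted above.
TRIPLE = "<http://d-nb.info/gnd/1193757835> owl:sameAs <http://isni.org/isni/0000000004345912>"

# Hypothetical lookup table with details for collections that may be
# linked in the GND RDF without a matching EntityFacts entry.
KNOWN_COLLECTIONS = {
    "http://isni.org": {
        "name": "International Standard Name Identifier (ISNI)",
        "abbr": "ISNI",
    },
    "https://orcid.org": {"name": "ORCID", "abbr": "ORCID"},
}


def same_as_entry(triple):
    """Build a sameAs entry from one 'subject owl:sameAs object' line."""
    match = re.fullmatch(r"<([^>]+)>\s+owl:sameAs\s+<([^>]+)>\s*\.?", triple.strip())
    if match is None:
        raise ValueError(f"not an owl:sameAs triple: {triple!r}")
    target = match.group(2)
    parts = urlsplit(target)
    collection_id = f"{parts.scheme}://{parts.netloc}"
    # Start from the bare id and add details only when they are known.
    collection = {"id": collection_id}
    collection.update(KNOWN_COLLECTIONS.get(collection_id, {}))
    return {"id": target, "collection": collection}


print(same_as_entry(TRIPLE))
```

Unknown collections keep only their id, so the entry degrades to what the GND RDF alone provides.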

I hope this helps you better understand what is going on. Generally, the whole situation with two data sources, one (GND RDF) updated daily and the other (EntityFacts) only every few months, is not ideal and is error-prone (see the list of sameAs-related issues alone), and we try to make the best of it. The reason we provide this service at all is that we offer useful services over this data: a powerful API, a responsive and performant search interface, and an OpenRefine reconciliation service. If you do not need any of this and it is important to you to see all the information in the GND, then you should stick to the DNB interface, OGND, or another service.

fsteeg commented 4 years ago

Deployed handling of missing collection details to test: https://test.lobid.org/gnd/1193757835
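One possible way to handle missing collection details at display time is to fall back to the host name of the link as its label. The Python sketch below assumes exactly that; the function render_same_as and the "name" field are hypothetical and do not describe the deployed change.

```python
from html import escape
from urllib.parse import urlsplit


def render_same_as(link):
    """Render a sameAs link, falling back to the host name as label."""
    collection = link.get("collection", {})
    # Assumption: "name" holds the display label when enrichment happened.
    label = collection.get("name") or urlsplit(link["id"]).netloc
    return f'<a href="{escape(link["id"], quote=True)}">{escape(label)}</a>'


# The un-enriched ISNI link from the JSON quoted earlier in this thread.
link = {
    "id": "http://isni.org/isni/0000000004345912",
    "collection": {"id": "http://isni.org"},
}
print(render_same_as(link))
```

With a fallback of this kind the link always has some visible label, even before any enrichment is available.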

acka47 commented 4 years ago

+1 This will work for now. As discussed offline, we will first try to get a more general and future-proof solution to improve sameAs linking instead of adding data enrichment during transformation.

annainfo commented 4 years ago

FTR 1) "As we are processing the GND RDF and not the full MARC21 information and as the EntityFacts entry was not present in the last dump we processed, this is what we also have in lobid-gnd by now, so there is nothing missing at all and the description is accurate." - It was not, as the ISNI from the GND RDF was not displayed.

FTR 2) The web.archive.org URL for the version 20190916 now redirects to 20190917. Maybe the version used in the initial GitHub issue description was deleted.