hbz / lobid-resources

Transformation, web frontend, and API for the hbz catalog as LOD
http://lobid.org/resources
Eclipse Public License 2.0

Provide "subjectOf" as BEACON? #1987

Open dr0i opened 2 months ago

dr0i commented 2 months ago

Via email from S.H., dated 15.04.24:

On this occasion I would also like to suggest a BEACON file that likewise links to the hbz union catalog by subject (subjectOf). So far there is only the person/organization-based (contributor) BEACON for it: https://lobid.org/download/beacons/hbzlod-pndbeacon.txt

Analogous to https://github.com/hbz/nwbib/issues/646: provide a subjectOf list as a BEACON. (This may be too much work to do "on the side", since it would mean iterating over 26M records rather than 0.5M.)

dr0i commented 2 months ago

Maybe it is sufficient to query the list of interest by using the API like this:

subject.componentList.id:"https://d-nb.info/gnd/4066009-6" OR subject.componentList.id:"https://d-nb.info/gnd/4004744-1"

You can "OR" as many GND subjects (or DDC etc.) as you want. Note that it may be necessary to POST instead of GET, since the length of a URL is limited.
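A minimal sketch of how such an OR query could be assembled for an arbitrary list of GND identifiers. The helper name build_query is hypothetical; only the field name and the https://d-nb.info/gnd/ prefix are taken from the query above. The commented curl call assumes the same search endpoint and lets curl do the URL encoding. It still sends a GET request, so very long lists may run into the URL length limit mentioned above.

```shell
# build_query is a hypothetical helper, not part of the lobid API.
# It joins GND identifiers into one OR query on subject.componentList.id.
# The body runs in a subshell ( ... ) so its variables do not leak.
build_query() (
  query=''
  for id in "$@"; do
    term="subject.componentList.id:\"https://d-nb.info/gnd/${id}\""
    if [ -z "$query" ]; then
      query="$term"            # first term: no leading OR
    else
      query="$query OR $term"  # later terms: joined with OR
    fi
  done
  printf '%s\n' "$query"
)

# Usage (needs network access, therefore commented out):
# curl --get --header "Accept-Encoding: gzip" \
#   --data-urlencode "q=$(build_query 4066009-6 4004744-1)" \
#   --data-urlencode "format=jsonl" \
#   'http://lobid.org/resources/search' > architektur.jsonl.gz
```

Keeping the identifiers as plain arguments means the list of subjects can live in a text file and be passed in with something like build_query $(cat ids.txt), where ids.txt is again only an illustrative name.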

Get all the data like this:

curl --header "Accept-Encoding: gzip" 'http://lobid.org/resources/search?q=subject.componentList.id%3A%22https%3A%2F%2Fd-nb.info%2Fgnd%2F4066009-6%22+OR++subject.componentList.id%3A%22https%3A%2F%2Fd-nb.info%2Fgnd%2F4004744-1%22&format=jsonl' > architektur.jsonl.gz

(More API tricks can be found e.g. at http://lobid.org/resources/api#bulk_downloads.)

Then decompress the file:

gunzip architektur.jsonl.gz

Get all lobid/Alma IDs like this:

jq .almaMmsId architektur.jsonl
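If links rather than bare IDs are wanted, the jq output could be post-processed along these lines. This is only a sketch: ids_to_urls is a hypothetical helper, and the base URL http://lobid.org/resources/ followed by the ID is an assumption about how a single record is addressed, so it is passed as an overridable argument.

```shell
# ids_to_urls is a hypothetical helper. It reads one ID per line on stdin,
# strips the double quotes that plain `jq .almaMmsId` prints, skips empty
# lines and "null" (records without the field), drops duplicates, and
# prefixes each remaining ID with a base URL.
# ASSUMPTION: the default base URL below is illustrative, not confirmed.
ids_to_urls() (
  base="${1:-http://lobid.org/resources/}"
  tr -d '"' |
    awk -v base="$base" 'NF && $0 != "null" && !seen[$0]++ { print base $0 }'
)

# Usage (requires jq and the downloaded file, therefore commented out):
# jq .almaMmsId architektur.jsonl | ids_to_urls > architektur-links.txt
```

The order of first occurrence is preserved, so the resulting list stays aligned with the order of the downloaded records.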

I will ask S.H. whether this would be feasible.