Add SPARQL backend - Githubissues

gbv / subjects-api

JSKOS Concept Occurrences Provider implementation

https://coli-conc.gbv.de/subjects/

MIT License


Add SPARQL backend #31

Open nichtich opened 1 year ago

nichtich commented 1 year ago

Add a SPARQL backend in addition to SQLite and PostgreSQL (#26). Loading the NTriples dump into Fuseki takes considerably more time, and queries might be slower as well, but if query performance is acceptable, a SPARQL backend may provide more flexible kinds of queries, such as transitive inclusion of narrower concepts.
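As a rough illustration of what "transitive inclusion of narrower concepts" could look like, here is a minimal sketch that builds an occurrence query with a SPARQL 1.1 property path. The function name `occurrence_query`, the default predicate, and the assumption that the triple store also holds `skos:broader` links between concepts are all hypothetical and not taken from the current code base.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"


def occurrence_query(concept_uri: str,
                     subject_predicate: str = DCT_SUBJECT,
                     transitive: bool = False) -> str:
    """Build a SPARQL query counting records indexed with a concept.

    Hypothetical helper: URIs are inserted as-is, so callers would need
    to validate them before use.
    """
    if transitive:
        # skos:broader* is a zero-or-more property path: it matches the
        # concept itself plus every concept that reaches it by following
        # skos:broader links, i.e. all of its (transitive) narrower concepts.
        pattern = (
            f"  ?concept <{SKOS}broader>* <{concept_uri}> .\n"
            f"  ?doc <{subject_predicate}> ?concept .\n"
        )
    else:
        pattern = f"  ?doc <{subject_predicate}> <{concept_uri}> .\n"
    # DISTINCT avoids counting a record twice when it carries several
    # narrower concepts of the same parent.
    return "SELECT (COUNT(DISTINCT ?doc) AS ?c) {\n" + pattern + "}"


if __name__ == "__main__":
    print(occurrence_query("http://d-nb.info/gnd/4062901-6", transitive=True))
```

Whether such a path query performs acceptably in Fuseki on the full dump would still need to be measured.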

nichtich commented 1 year ago

See https://labs.onb.ac.at/de/tool/sparql/ for a public SPARQL endpoint (read-only) to experiment with: ANNO (historical newspapers) and AKON (historical postcards), e.g.

Occurrence:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT (COUNT(?doc) AS ?c) {
  ?doc dc:subject <http://d-nb.info/gnd/4062901-6> .
}
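
For experimenting from a script, a count query like the one above could be sent with the standard SPARQL Protocol and read from the SPARQL JSON results format. This is only a sketch: the helper names `build_request` and `read_count` are made up, and the endpoint URL is left as a parameter because the actual query URL of the service is not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL 1.1 JSON results serialization.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def read_count(body: str, var: str = "c") -> int:
    """Extract a single COUNT value from a SPARQL JSON results document."""
    bindings = json.loads(body)["results"]["bindings"]
    # Literal values arrive as strings, even for xsd:integer.
    return int(bindings[0][var]["value"]) if bindings else 0
```

The prepared request could then be passed to `urllib.request.urlopen` and the decoded response body to `read_count`.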

Co-Occurrence: none (each title seems to have only one of 42 subjects?):

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?subject (COUNT(DISTINCT ?doc) AS ?count) {
  ?doc dc:subject ?subject .
  FILTER(isIRI(?subject))
} GROUP BY ?subject ORDER BY DESC(?count)
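
The query above lists the number of documents per subject. For data where records do carry several subjects, a co-occurrence query for one given concept might be built roughly like this. The function name `cooccurrence_query` is hypothetical, and the default predicate is simply the one used by this endpoint.

```python
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"


def cooccurrence_query(concept_uri: str,
                       subject_predicate: str = DC_SUBJECT) -> str:
    """Build a query listing subjects that share records with a concept.

    Hypothetical helper: URIs are inserted without validation.
    """
    return (
        "SELECT ?other (COUNT(DISTINCT ?doc) AS ?count) {\n"
        # Records indexed with the given concept ...
        f"  ?doc <{subject_predicate}> <{concept_uri}> .\n"
        # ... and any further subject on the same record.
        f"  ?doc <{subject_predicate}> ?other .\n"
        # Skip literals and the concept itself.
        f"  FILTER(isIRI(?other) && ?other != <{concept_uri}>)\n"
        "} GROUP BY ?other ORDER BY DESC(?count)"
    )


if __name__ == "__main__":
    print(cooccurrence_query("http://d-nb.info/gnd/4062901-6"))
```

On the ONB data this would be expected to return no rows if each title really has only one subject.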

nichtich commented 1 month ago

Unfortunately, https://labs.onb.ac.at/de/tool/sparql/ does not include skos:inScheme, and it uses http://purl.org/dc/elements/1.1/subject instead of http://purl.org/dc/terms/subject. The predicate could be made configurable, though.

nichtich commented 1 month ago

Things to further adjust:

Configure subject predicate (dc:subject vs dct:subject)
Record URIs are hard-coded to http://uri.gbv.de/document/opac-de-627:ppn:$; arbitrary record URIs should be allowed as well
Partial and full import have not been implemented yet
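
The first two adjustments could be captured in a small configuration object along these lines. This is a sketch only: the class `SparqlBackendConfig` and its fields are invented for illustration, and it assumes that the `$` in the hard-coded record URI stands for the PPN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SparqlBackendConfig:
    """Hypothetical settings for a SPARQL backend."""
    # dct:subject by default; set to the dc:subject URI for other endpoints.
    subject_predicate: str = "http://purl.org/dc/terms/subject"
    # Assumption: "$" marks where the record identifier (PPN) goes.
    record_uri_template: str = "http://uri.gbv.de/document/opac-de-627:ppn:$"

    def record_uri(self, record: str) -> str:
        """Return a record URI, accepting either a full URI or a bare id."""
        if record.startswith(("http://", "https://")):
            # Arbitrary record URIs are passed through unchanged.
            return record
        return self.record_uri_template.replace("$", record)


if __name__ == "__main__":
    onb = SparqlBackendConfig(
        subject_predicate="http://purl.org/dc/elements/1.1/subject")
    print(onb.subject_predicate)
    print(onb.record_uri("123456789"))
```

Passing full URIs through unchanged keeps the existing PPN-based behaviour as the default while allowing other sources.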