netwerk-digitaal-erfgoed / network-of-terms

Search engine for finding terms in terminology sources (such as thesauri, classification systems and reference lists)
https://termennetwerk-api.netwerkdigitaalerfgoed.nl
European Union Public License 1.2

Add BAG #695

Open ddeboer opened 2 years ago

ddeboer commented 2 years ago
ddeboer commented 8 months ago

The idea is that we want to search by street, house number and place name.

@rschalkrce has produced a pure SPARQL query that unfortunately is not performant enough. Can you paste your query here?

I think we need a fulltext search index here.

@pmaria Do you happen to know if Kadaster offers such an index?
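
To illustrate why a full-text index helps here, below is a minimal Python sketch of a toy inverted index over address labels. It is not how Kadaster or Triply implement anything; the identifiers (`vbo-1` and so on) and the sample addresses are made up for illustration. The point is only that a lookup touches the postings of each search token instead of scanning every concatenated address string, which is what a CONTAINS filter has to do.

```python
from collections import defaultdict

# Hypothetical sample data: ids and labels are invented for this sketch.
ADDRESSES = {
    "vbo-1": "Engweg 6 Bunnik",
    "vbo-2": "Engweg 8 Bunnik",
    "vbo-3": "Dorpsstraat 6 Bunnik",
}


def tokenize(text):
    """Lowercase a string and split it on whitespace."""
    return text.lower().split()


def build_index(addresses):
    """Map every token to the set of address ids whose label contains it."""
    index = defaultdict(set)
    for address_id, label in addresses.items():
        for token in tokenize(label):
            index[token].add(address_id)
    return index


def search(index, query):
    """Return the sorted ids of addresses that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # Intersect the posting sets; .get() avoids adding keys to the defaultdict.
    matches = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(matches)


INDEX = build_index(ADDRESSES)
print(search(INDEX, "engweg 6 bunnik"))
```

Because the tokens are matched independently, the house number narrows the result set instead of forcing a substring scan over every street, number and place combination.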

rschalkrce commented 8 months ago

The query is something like the one below. It works well if you exclude ?huisnummer from ?searchfield, but that is not a viable search strategy for addresses.

PREFIX nen3610: <https://data.kkg.kadaster.nl/nen3610/model/def/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX kad: <https://data.kkg.kadaster.nl/kad/model/def/>
PREFIX sor: <https://data.kkg.kadaster.nl/sor/model/def/>
PREFIX bag: <http://bag.basisregistraties.overheid.nl/def/bag#>

CONSTRUCT {

  ?verblijfsobject
    a skos:Concept ;
    skos:prefLabel ?label ;
    skos:altLabel ?gebouwId .
}

WHERE {
  ?ruimte
    a sor:OpenbareRuimte ;
    skos:prefLabel ?straat_label ;
    sor:ligtIn/skos:prefLabel ?plaats_label .

  ?nummeraanduiding
    sor:ligtAan ?ruimte ;
    ^sor:hoofdadres ?verblijfsobject ;
    sor:postcode ?postcode ;
    sor:huisnummer ?huisnummer .
  optional { ?nummeraanduiding sor:huisnummertoevoeging ?huisnummer_toevoeging }

  ?verblijfsobject
    sor:maaktDeelUitVan ?gebouw .

  # Build the lowercased search field first, then filter on it.
  BIND(LCASE(CONCAT(STR(?straat_label), " ", STR(?huisnummer), " ", STR(?plaats_label))) AS ?searchfield)
  FILTER(CONTAINS(?searchfield, LCASE("engweg 6 bunnik"))) # replace this string with the Network of Terms user's search query

  BIND(strafter(str(?gebouw), "gebouw/") as ?gebouwId)
  BIND(CONCAT(STR(?straat_label), " ", STR(?huisnummer),IF(BOUND(?huisnummer_toevoeging), ?huisnummer_toevoeging, ""), " ", STR(?postcode), " ", STR(?plaats_label)) as ?label)

} LIMIT 1000
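
The comment in the query says the hard-coded string should be replaced by the user's search query. As a rough sketch of that substitution step, the Python below normalizes the input and escapes it before placing it in a SPARQL string literal. How the Network of Terms actually injects the query is not shown in this thread, so `escape_sparql_string` and `build_filter` are hypothetical helper names, not part of any existing API.

```python
def escape_sparql_string(value):
    """Escape characters that would end or break a double-quoted SPARQL literal."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ('"', '\\"'),
        ("\n", "\\n"),
        ("\r", "\\r"),
        ("\t", "\\t"),
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


def build_filter(user_query):
    """Return the FILTER line for a user query (hypothetical helper)."""
    # Lowercase and collapse runs of whitespace so the input lines up with
    # the single-space, lowercased ?searchfield built in the query.
    normalized = " ".join(user_query.lower().split())
    return 'FILTER(CONTAINS(LCASE(STR(?searchfield)), "{}"))'.format(
        escape_sparql_string(normalized)
    )


print(build_filter("Engweg 6  Bunnik"))
```

Escaping matters because the search string comes from end users: an unescaped double quote would terminate the literal and could change the meaning of the query.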

pmaria commented 8 months ago

You could perhaps use the PDOK locatieserver: https://www.pdok.nl/pdok-locatieserver

Could you elaborate a bit more on the use case?

rschalkrce commented 8 months ago

@pmaria the use case is to add Kadaster addresses (?verblijfsobject) to the Network of Terms, so that these URIs can be used to geolocate/identify objects in digital heritage collections. The Network of Terms can only retrieve URIs by accessing endpoints like the Kadaster's through SPARQL queries, so I think your proposed solution will not work.

Do you perhaps know if the Kadaster SPARQL endpoint supports text indexing?

ddeboer commented 8 months ago

The Kadaster SPARQL endpoint runs on Triply (Speedy).

rschalkrce commented 8 months ago

I am aware of that, but perhaps there is some built-in indexing we are not aware of ;) I have been in contact with Kadaster and they will reach out to Triply about this.