dice-group / LIMES

Link Discovery Framework for Metric Spaces.
https://limes.demos.dice-research.org/
GNU Affero General Public License v3.0

How to compare two labels? #312

Open dersuchendee opened 1 year ago

dersuchendee commented 1 year ago

I don't think I understand how to compare two labels. I want to compare the labels of rivers in two datasets and, where they match, add an owl:sameAs link to the corresponding Wikidata identifier. Running the configuration below fails with a "Malformed URL" error.

The file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE LIMES SYSTEM "limes.dtd">
<LIMES>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2004/02/skos/core/</NAMESPACE>
        <LABEL>skos</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2000/01/rdf-schema/</NAMESPACE>
        <LABEL>rdfs</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/1999/02/22-rdf-syntax-ns/</NAMESPACE>
        <LABEL>rdf</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2002/07/owl/</NAMESPACE>
        <LABEL>owl</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>https://dati.isprambiente.it/ontology/top/</NAMESPACE>
        <LABEL>ispra-top</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://semweb.mmlab.be/ns/ql/</NAMESPACE>
        <LABEL>ql</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>https://w3id.org/whow/data/hydrography/</NAMESPACE>
        <LABEL>hydro</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/ns/r2rml/</NAMESPACE>
        <LABEL>rr</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://semweb.mmlab.be/ns/rml/</NAMESPACE>
        <LABEL>rml</LABEL>
    </PREFIX>
    <SOURCE>
        <ID>wikidata</ID>
        <ENDPOINT>C:\INSTALLER\linking-wikidata-output.ttl</ENDPOINT>
        <VAR>?riverLabel</VAR>
        <PAGESIZE>1000</PAGESIZE>
        <RESTRICTION>?river rdf:type ispra-top:UniqueIdentifier</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase RENAME label</PROPERTY>
        <TYPE>TURTLE</TYPE>
    </SOURCE>
    <TARGET>
        <ID>hydro</ID>
        <ENDPOINT>C:\INSTALLER\hydromappingrivers-output.ttl</ENDPOINT>
        <VAR>?riverLabel2</VAR>
        <PAGESIZE>1000</PAGESIZE>
        <RESTRICTION>?river2 rdf:type hydro:RiverWaterBody</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase RENAME label</PROPERTY>
    </TARGET>
    <METRIC>Trigram(river.label,river2.label)</METRIC>
    <ACCEPTANCE>
        <THRESHOLD>0.3</THRESHOLD>
        <FILE>accepted_results.nt</FILE>
        <RELATION>owl:sameAs</RELATION>
    </ACCEPTANCE>
    <REVIEW>
        <THRESHOLD>0.5</THRESHOLD>
        <FILE>trial_results.txt</FILE>
        <RELATION>owl:sameAs</RELATION>
    </REVIEW>

    <EXECUTION>
        <REWRITER>default</REWRITER>
        <PLANNER>default</PLANNER>
        <ENGINE>default</ENGINE>
        <OPTIMIZATION_TIME>1000</OPTIMIZATION_TIME>
        <EXPECTED_SELECTIVITY>0.5</EXPECTED_SELECTIVITY>
    </EXECUTION>

    <OUTPUT>RDF</OUTPUT>
</LIMES>

The error:

10:29:14.909 [main] [] INFO  org.aksw.limes.core.io.query.SparqlQueryModule:67 - Querying the endpoint.
10:29:14.909 [main] [] INFO  org.aksw.limes.core.io.query.SparqlQueryModule:82 - Getting statements 0 to 1000
Exception in thread "main" HttpException: 0 Malformed URL: java.net.MalformedURLException: unknown protocol: c
        at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:319)
        at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:288)
        at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execResultSetInner(QueryEngineHTTP.java:352)
        at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:344)
        at org.aksw.limes.core.io.query.SparqlQueryModule.fillCache(SparqlQueryModule.java:117)
        at org.aksw.limes.core.io.query.SparqlQueryModule.fillCache(SparqlQueryModule.java:49)
        at org.aksw.limes.core.io.cache.HybridCache.getData(HybridCache.java:144)
        at org.aksw.limes.core.io.cache.HybridCache.getData(HybridCache.java:106)
        at org.aksw.limes.core.controller.Controller.getMapping(Controller.java:197)
        at org.aksw.limes.core.controller.Controller.getMapping(Controller.java:187)
        at org.aksw.limes.core.controller.Controller.main(Controller.java:97)
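
A possible reading of the trace, offered as a sketch rather than a confirmed diagnosis: "unknown protocol: c" is what java.net.URL reports when a Windows path such as C:\... is parsed as a URL, and the frames show SparqlQueryModule handing the endpoint to Jena's HTTP query engine. That suggests one of the endpoints was treated as a remote SPARQL endpoint instead of a local file. In the file above, SOURCE declares TYPE TURTLE but TARGET has no TYPE, so TARGET is the likelier candidate, although the trace does not say which endpoint failed. The VAR values (?riverLabel, ?riverLabel2) also differ from the variables used in RESTRICTION and METRIC (?river, ?river2). The fragment below shows the TARGET block with both points changed; both changes are assumptions, and SOURCE would need the matching ?river in its VAR.

```xml
<!-- Sketch only: the TARGET block from the file above with two assumed changes. -->
<TARGET>
    <ID>hydro</ID>
    <ENDPOINT>C:\INSTALLER\hydromappingrivers-output.ttl</ENDPOINT>
    <!-- Assumption: VAR should name the resource variable used in RESTRICTION and METRIC. -->
    <VAR>?river2</VAR>
    <PAGESIZE>1000</PAGESIZE>
    <RESTRICTION>?river2 rdf:type hydro:RiverWaterBody</RESTRICTION>
    <PROPERTY>rdfs:label AS nolang->lowercase RENAME label</PROPERTY>
    <!-- Assumption: declaring the same file TYPE as SOURCE makes the path load as a local file. -->
    <TYPE>TURTLE</TYPE>
</TARGET>
```

Separately, the standard rdf, rdfs, owl and skos namespace IRIs end in # rather than / (for example http://www.w3.org/2000/01/rdf-schema#), so with the PREFIX entries as written, rdf:type and rdfs:label may match nothing even once both files load.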