dice-group / LIMES-legacy

Repository of LIMES releases
GNU Affero General Public License v3.0

Cannot address a CSV column containing a space character #9

Closed: earthquakesan closed this issue 10 years ago

earthquakesan commented 10 years ago

Log:

ivan@ivan-Latitude-E6520:~/Soft/Installed/LIMESRC3.2$ java -jar LIMES.jar /tmp/limeslinkcsvtodbpedia.xml 
 WARN [main] (ConfigReader.java:260) - 
java.lang.NullPointerException
    at de.uni_leipzig.simba.io.ConfigReader.validateAndRead(ConfigReader.java:229)
    at de.uni_leipzig.simba.io.ConfigReader.validateAndRead(ConfigReader.java:315)
    at de.uni_leipzig.simba.controller.PPJoinController.run(PPJoinController.java:133)
    at de.uni_leipzig.simba.controller.PPJoinController.main(PPJoinController.java:32)
 WARN [main] (ConfigReader.java:262) - Some values were not set. Crossing my fingers and using defaults.
 INFO [main] (PPJoinController.java:135) - ID: CERL
Var: ?x
Prefixes: {dc=http://purl.org/dc/terms/, rdfs=http://www.w3.org/2000/01/rdf-schema#, drugbank=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/, dbo=http://dbpedia.org/ontology/, foaf=http://xmlns.com/foaf/0.1/, owl=http://www.w3.org/2002/07/owl#, rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#}
Endpoint: /home/ivan/.virtualenvs/csv2rdf-publicdataeu/src/CSV2RDF-WIKI/files/csv/02f31d80-40cc-496d-ad79-2cf02daa5675
Graph: null
Restrictions: []
Properties: [Department Family]
Functions: {Department Family={Department Family=lowercase}}
Page size: -1
Type: csv

 INFO [main] (PPJoinController.java:136) - ID: CERL
Var: ?y
Prefixes: {dc=http://purl.org/dc/terms/, rdfs=http://www.w3.org/2000/01/rdf-schema#, drugbank=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/, dbo=http://dbpedia.org/ontology/, foaf=http://xmlns.com/foaf/0.1/, owl=http://www.w3.org/2002/07/owl#, rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#}
Endpoint: http://dbpedia.org/sparql
Graph: null
Restrictions: [?y rdf:type dbo:Organisation]
Properties: [rdfs:label]
Functions: {rdfs:label={rdfs:label=nolang->lowercase}}
Page size: -1
Type: sparql

 INFO [main] (PPJoinController.java:141) - Loading source data ...
 INFO [main] (HybridCache.java:231) - Checking for file /home/ivan/Soft/Installed/LIMESRC3.2/cache/-59200047.ser
 INFO [main] (HybridCache.java:234) - Found cached data. Loading data from file /home/ivan/Soft/Installed/LIMESRC3.2/cache/-59200047.ser
 INFO [main] (HybridCache.java:240) - Cached data loaded successfully from file /home/ivan/Soft/Installed/LIMESRC3.2/cache/-59200047.ser
 INFO [main] (HybridCache.java:241) - Size = 1
 INFO [main] (PPJoinController.java:146) - Loading target data ...
 INFO [main] (HybridCache.java:231) - Checking for file /home/ivan/Soft/Installed/LIMESRC3.2/cache/181749804.ser
 INFO [main] (HybridCache.java:234) - Found cached data. Loading data from file /home/ivan/Soft/Installed/LIMESRC3.2/cache/181749804.ser
 INFO [main] (HybridCache.java:240) - Cached data loaded successfully from file /home/ivan/Soft/Installed/LIMESRC3.2/cache/181749804.ser
 INFO [main] (HybridCache.java:241) - Size = 10543
 INFO [main] (PPJoinController.java:159) - Getting links ...
Got mapper with name <EDJoin> for expression <levenshtein>
Got measure levenshtein for name <levenshtein>
 WARN [main] (Instance.java:99) - Failed to access property <DepartmentFamily> on Department of Health
 INFO [main] (PPJoinController.java:162) - Got links in 541ms.
 INFO [main] (SerializerFactory.java:22) - Getting serializer with name null
 INFO [main] (PPJoinController.java:172) - Using N3Serializer to serialize
 INFO [main] (PPJoinController.java:191) - Returned 0 links above acceptance threshold.
 INFO [main] (PPJoinController.java:192) - Returned 0 links to review.
 INFO [main] (PPJoinController.java:199) - Mapping carried out in 1.231 seconds
 INFO [main] (PPJoinController.java:200) - Done.

My config:

<?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSpy v2010 rel. 3 sp1 (http://www.altova.com)-->
<!DOCTYPE LIMES SYSTEM "limes.dtd">
<LIMES>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/1999/02/22-rdf-syntax-ns#</NAMESPACE>
        <LABEL>rdf</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2000/01/rdf-schema#</NAMESPACE>
        <LABEL>rdfs</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://xmlns.com/foaf/0.1/</NAMESPACE>
        <LABEL>foaf</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2002/07/owl#</NAMESPACE>
        <LABEL>owl</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/</NAMESPACE>
        <LABEL>drugbank</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://dbpedia.org/ontology/</NAMESPACE>
        <LABEL>dbo</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://purl.org/dc/terms/</NAMESPACE>
        <LABEL>dc</LABEL>
    </PREFIX>
    <SOURCE>
        <ID>CERL</ID>
        <ENDPOINT>/home/ivan/.virtualenvs/csv2rdf-publicdataeu/src/CSV2RDF-WIKI/files/csv/02f31d80-40cc-496d-ad79-2cf02daa5675</ENDPOINT>
        <VAR>?x</VAR>
        <PAGESIZE>-1</PAGESIZE>
        <RESTRICTION></RESTRICTION>
        <PROPERTY>Department Family AS lowercase</PROPERTY>
        <TYPE>csv</TYPE>
    </SOURCE>
    <TARGET>
        <ID>CERL</ID>
        <ENDPOINT>http://dbpedia.org/sparql</ENDPOINT>
        <VAR>?y</VAR>
        <PAGESIZE>-1</PAGESIZE>
        <RESTRICTION>?y rdf:type dbo:Organisation</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase</PROPERTY>
        <TYPE>sparql</TYPE>
    </TARGET>
    <METRIC>levenshtein(x.Department Family, y.rdfs:label)</METRIC>
    <ACCEPTANCE>
        <THRESHOLD>0.95</THRESHOLD>
        <FILE>/tmp/foo.nt</FILE>
        <RELATION>owl:sameAs</RELATION>
    </ACCEPTANCE>
    <REVIEW>
        <THRESHOLD>0.9</THRESHOLD>
        <FILE>/tmp/foo2.nt</FILE>
        <RELATION>owl:sameAs</RELATION>
    </REVIEW>
</LIMES>

earthquakesan commented 10 years ago

The CSV file to link: https://dl.dropboxusercontent.com/u/4882345/02f31d80-40cc-496d-ad79-2cf02daa5675

amrapalijz commented 8 years ago

Was this solved? I am getting the same error: "WARN main - Failed to access property on http://bio2rdf.org/geo:GSM226973/disease%20state", and then 0 links.

ngonga commented 8 years ago

Please use shortened URIs when using LIMES.

amrapalijz commented 8 years ago

Ok, thanks. Works now.

FatemeSH commented 8 years ago

I have the same error. What do you mean by using "shortened URIs"?

earthquakesan commented 8 years ago

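@FatemeSH This is a shortened URI: dbo:Organisation. The full (non-shortened) version of the same URI is http://dbpedia.org/ontology/Organisation. The dbo label stands for the namespace http://dbpedia.org/ontology/, which has to be declared in a PREFIX block of the config, as in the one posted above.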
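
To make the relation between the two forms concrete, here is a minimal Python sketch of shortening and expanding a URI against a prefix map. It is a standalone illustration, not LIMES code: the function names `shorten` and `expand` are hypothetical, and the prefix map is simply copied from the PREFIX entries of the config posted above.

```python
# Prefix map copied from the PREFIX entries of the config in this thread.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dbo": "http://dbpedia.org/ontology/",
    "dc": "http://purl.org/dc/terms/",
}


def shorten(uri, prefixes=PREFIXES):
    """Hypothetical helper: return 'label:local' for the longest matching
    namespace, or the URI unchanged if no declared namespace matches."""
    best = None
    for label, namespace in prefixes.items():
        if uri.startswith(namespace):
            if best is None or len(namespace) > len(best[1]):
                best = (label, namespace)
    if best is None:
        return uri
    label, namespace = best
    return label + ":" + uri[len(namespace):]


def expand(short, prefixes=PREFIXES):
    """Hypothetical helper: turn 'label:local' back into a full URI,
    leaving the input unchanged if the label is not a declared prefix."""
    label, sep, local = short.partition(":")
    if sep and label in prefixes:
        return prefixes[label] + local
    return short


if __name__ == "__main__":
    print(shorten("http://dbpedia.org/ontology/Organisation"))
    print(expand("rdfs:label"))
```

A URI whose namespace is not in the map comes back unchanged from `shorten`, which is a hint that a matching PREFIX entry would have to be added to the config before the shortened form can be used.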