dice-group / LIMES

Link Discovery Framework for Metric Spaces.
https://limes.demos.dice-research.org/
GNU Affero General Public License v3.0

java.lang.StringIndexOutOfBoundsException #246

Closed. TBoonX closed this issue 3 years ago.

TBoonX commented 3 years ago

This is my first time using LIMES, and I just want one simple working configuration.

Config file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LIMES SYSTEM "limes.dtd">
<LIMES>
<PREFIX>
  <NAMESPACE>http://www.w3.org/2000/01/rdf-schema#</NAMESPACE>
  <LABEL>rdfs</LABEL>
</PREFIX>
<PREFIX>
  <NAMESPACE>http://dbpedia.org/resource/Category:</NAMESPACE>
  <LABEL>dbr</LABEL>
</PREFIX>
<PREFIX>
  <NAMESPACE>http://purl.org/dc/terms/</NAMESPACE>
  <LABEL>dct</LABEL>
</PREFIX>
<PREFIX>
  <NAMESPACE>http://test.de/</NAMESPACE>
  <LABEL>test</LABEL>
</PREFIX>

<SOURCE>
  <ID>sourceId</ID>
  <ENDPOINT>/Users/kurt/Documents/Spring/dataretrieval/example_data.ttl</ENDPOINT>
  <VAR>?s</VAR>
  <PAGESIZE>1000</PAGESIZE>
  <RESTRICTION></RESTRICTION>
  <PROPERTY>test:repotype</PROPERTY>

  <TYPE>TURTLE</TYPE>
</SOURCE>
<TARGET>
  <ID>targetId</ID>
  <ENDPOINT>http://dbpedia.org/sparql</ENDPOINT>
  <VAR>?t</VAR>
  <PAGESIZE>1000</PAGESIZE>
  <RESTRICTION>?t dct:subject dbr:International_cultural_organizations</RESTRICTION>
  <PROPERTY>rdfs:label</PROPERTY>

  <TYPE>sparql</TYPE>
</TARGET>
<METRIC>
  exactmatch(s.test:repotype,t.rdfs:label|0.9
</METRIC>
<ACCEPTANCE>
  <THRESHOLD>0.98</THRESHOLD>
  <FILE>accepted.nt</FILE>
  <RELATION>owl:sameAs</RELATION>
</ACCEPTANCE>
<REVIEW>
  <THRESHOLD>0.9</THRESHOLD>
  <FILE>reviewme.nt</FILE>
  <RELATION>owl:sameAs</RELATION>
</REVIEW>
<EXECUTION>
  <REWRITER>DEFAULT</REWRITER>
  <PLANNER>DEFAULT</PLANNER>
  <ENGINE>DEFAULT</ENGINE>
</EXECUTION>
<OUTPUT>TAB</OUTPUT>
</LIMES>

Data file example_data.ttl:

<http://nomad-lab.eu/experiment/baBkU4yx2wQ1WIQJxuEZhKWq02bS-PACv39BKSFKLtphwxzY-Og> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://emmo.info/emmo/Experiment> .
<http://nomad-lab.eu/experiment/baBkU4yx2wQ1WIQJxuEZhKWq02bS-PACv39BKSFKLtphwxzY-Og> <http://emmo.info/emmo/hasParticipant> "1d279d31-6a26-42a1-8591-870303da6f04" .
<http://nomad-lab.eu/experiment/baBkU4yx2wQ1WIQJxuEZhKWq02bS-PACv39BKSFKLtphwxzY-Og> <http://www.w3.org/ns/dcat#modified> "2020-09-23T14:37:23.645582" .
<http://nomad-lab.eu/experiment/baBkU4yx2wQ1WIQJxuEZhKWq02bS-PACv39BKSFKLtphwxzY-Og> <http://www.w3.org/2000/01/rdf-schema#label> "O2" .
<http://nomad-lab.eu/experiment/baBkU4yx2wQ1WIQJxuEZhKWq02bS-PACv39BKSFKLtphwxzY-Og> <http://test.de/repotype> "NOMAD" .

Console output:

2021-05-07 17:03:00,294 main INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory.
17:03:00.412 [main] [] ERROR org.aksw.limes.core.io.config.Configuration:196 - Undefined prefix: owl
17:03:00.424 [main] [] WARN  org.aksw.limes.core.io.config.reader.xml.XMLConfigurationReader:498 - null
java.lang.RuntimeException
    at org.aksw.limes.core.io.config.Configuration.setAcceptanceRelation(Configuration.java:197)
    at org.aksw.limes.core.io.config.reader.xml.XMLConfigurationReader.validateAndRead(XMLConfigurationReader.java:420)
    at org.aksw.limes.core.io.config.reader.xml.XMLConfigurationReader.read(XMLConfigurationReader.java:206)
    at org.aksw.limes.core.controller.Controller.getConfig(Controller.java:164)
    at org.aksw.limes.core.controller.Controller.main(Controller.java:86)
17:03:00.424 [main] [] WARN  org.aksw.limes.core.io.config.reader.xml.XMLConfigurationReader:500 - Some values were not set. Crossing my fingers and using defaults.
17:03:00.425 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:115 - Checking for file /Users/kurt/Documents/STREAM/LIMES/cache/781314872.ser
17:03:00.426 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:118 - Found cached data. Loading data from file /Users/kurt/Documents/STREAM/LIMES/cache/781314872.ser
17:03:00.437 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:124 - Cached data loaded successfully from file /Users/kurt/Documents/STREAM/LIMES/cache/781314872.ser
17:03:00.438 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:125 - Size = 1
17:03:00.438 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:115 - Checking for file /Users/kurt/Documents/STREAM/LIMES/cache/801983563.ser
17:03:00.439 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:118 - Found cached data. Loading data from file /Users/kurt/Documents/STREAM/LIMES/cache/801983563.ser
17:03:00.451 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:124 - Cached data loaded successfully from file /Users/kurt/Documents/STREAM/LIMES/cache/801983563.ser
17:03:00.452 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:125 - Size = 72
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: begin 11, end -1, length 43
    at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3720)
    at java.base/java.lang.String.substring(String.java:1909)
    at org.aksw.limes.core.io.parser.Parser.getTerms(Parser.java:194)
    at org.aksw.limes.core.io.parser.Parser.<init>(Parser.java:40)
    at org.aksw.limes.core.io.ls.LinkSpecification.readSpec(LinkSpecification.java:187)
    at org.aksw.limes.core.io.ls.LinkSpecification.<init>(LinkSpecification.java:76)
    at org.aksw.limes.core.controller.LSPipeline.execute(LSPipeline.java:51)
    at org.aksw.limes.core.controller.Controller.getMapping(Controller.java:214)
    at org.aksw.limes.core.controller.Controller.getMapping(Controller.java:177)
    at org.aksw.limes.core.controller.Controller.main(Controller.java:87)

kvndrsslr commented 3 years ago

You are missing a closing parenthesis in your link specification. Also, if you specify only a single atomic link specification, you do not need to guard it with a threshold (the |0.9 suffix), i.e.

<METRIC>
  exactmatch(s.test:repotype,t.rdfs:label|0.9
</METRIC>

should be

<METRIC>
  exactmatch(s.test:repotype,t.rdfs:label)
</METRIC>

Let me know if that works.
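
For readers wondering why a missing parenthesis surfaces as a StringIndexOutOfBoundsException, here is a minimal sketch. It assumes the parser slices the text between the first "(" and the last ")". That is a guess consistent with the exception message, not the actual LIMES source. The class and method names (MetricSpecCheck, innerTerms, hasBalancedParentheses) are hypothetical. The trimmed metric string is 43 characters long and "exactmatch(" is 11 characters, which matches "begin 11, end -1, length 43" in the console output: lastIndexOf returns -1 when no ")" is present.

```java
public class MetricSpecCheck {

    // Hypothetical stand-in for the parsing step (an assumption, not LIMES code):
    // return the text between the first '(' and the last ')'.
    static String innerTerms(String spec) {
        int begin = spec.indexOf('(') + 1;
        int end = spec.lastIndexOf(')'); // -1 when the spec has no ')'
        return spec.substring(begin, end); // throws if end is -1
    }

    // Cheap pre-check that can be run on a metric expression before loading it.
    static boolean hasBalancedParentheses(String spec) {
        int depth = 0;
        for (char c : spec.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth < 0) {
                return false; // a ')' appeared before its matching '('
            }
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        String broken = "exactmatch(s.test:repotype,t.rdfs:label|0.9";
        String fixed = "exactmatch(s.test:repotype,t.rdfs:label)";

        System.out.println("broken balanced? " + hasBalancedParentheses(broken));
        System.out.println("fixed balanced?  " + hasBalancedParentheses(fixed));
        System.out.println("fixed terms:     " + innerTerms(fixed));

        try {
            innerTerms(broken);
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message wording depends on the JDK version.
            System.out.println("broken spec fails: " + e.getMessage());
        }
    }
}
```

Running a check like hasBalancedParentheses on the METRIC text before starting LIMES would catch this class of typo early. Separately, the earlier log line "Undefined prefix: owl" is not addressed in this thread. It suggests the configuration has no PREFIX entry for the owl label used in the RELATION elements.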

TBoonX commented 3 years ago

Thanks for the solution, the error is gone.