dice-group / LIMES

Link Discovery Framework for Metric Spaces.
https://limes.demos.dice-research.org/
GNU Affero General Public License v3.0

Output tag ignored? #245

Closed. KonradHoeffner closed this 2 years ago.

KonradHoeffner commented 3 years ago

I tried both <OUTPUT>CSV</OUTPUT> and <OUTPUT>TAB</OUTPUT>, but each time LIMES 1.7.4 writes the output as a Turtle file.

Console Output

limes$ dickinson-ehr.xml
2021-03-30 17:16:04,445 main INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory.
17:16:04.537 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:115 - Checking for file /home/konrad/projekte/hito/ontology/limes/cache/-2145805653.ser
17:16:04.543 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:118 - Found cached data. Loading data from file /home/konrad/projekte/hito/ontology/limes/cache/-2145805653.ser
17:16:04.555 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:124 - Cached data loaded successfully from file /home/konrad/projekte/hito/ontology/limes/cache/-2145805653.ser
17:16:04.556 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:125 - Size = 47
17:16:04.557 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:115 - Checking for file /home/konrad/projekte/hito/ontology/limes/cache/1156067003.ser
17:16:04.557 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:118 - Found cached data. Loading data from file /home/konrad/projekte/hito/ontology/limes/cache/1156067003.ser
17:16:04.577 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:124 - Cached data loaded successfully from file /home/konrad/projekte/hito/ontology/limes/cache/1156067003.ser
17:16:04.577 [main] [] INFO  org.aksw.limes.core.io.cache.HybridCache:125 - Size = 326
17:16:04.832 [main] [] INFO  org.aksw.limes.core.controller.Controller:222 - Mapping task finished in 214 ms
17:16:04.834 [main] [] INFO  org.aksw.limes.core.controller.Controller:226 - Mapping size: 38 (accepted) + 873 (need verification) = 911 (total)
17:16:04.834 [main] [] INFO  org.aksw.limes.core.controller.Controller:93 - Writing result files...
17:16:04.835 [main] [] INFO  org.aksw.limes.core.io.serializer.SerializerFactory:15 - Getting serializer with name CSV
17:16:04.843 [main] [] INFO  org.aksw.limes.core.controller.Controller:96 - Writing statistics file...
limes$ ls di-ehr-close.*
di-ehr-close.ttl

Configuration File

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE LIMES SYSTEM "limes.dtd">
<LIMES>
    <PREFIX>
        <NAMESPACE>http://hitontology.eu/ontology/</NAMESPACE>
        <LABEL>hito</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/1999/02/22-rdf-syntax-ns#</NAMESPACE>
        <LABEL>rdf</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2000/01/rdf-schema#</NAMESPACE>
        <LABEL>rdfs</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2002/07/owl#</NAMESPACE>
        <LABEL>owl</LABEL>
    </PREFIX>
    <PREFIX>
        <NAMESPACE>http://www.w3.org/2004/02/skos/core#</NAMESPACE>
        <LABEL>skos</LABEL>
    </PREFIX>

    <SOURCE>
        <ID>dickinson</ID>
        <ENDPOINT>../dickinson.ttl</ENDPOINT>
        <VAR>?di</VAR>
        <PAGESIZE>-1</PAGESIZE>
        <RESTRICTION>?di a hito:FeatureClassified; hito:featureCatalogue hito:Dickinson.</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase->regularalphabet RENAME label</PROPERTY>
        <TYPE>TURTLE</TYPE>
    </SOURCE>

    <TARGET>
        <ID></ID>
        <ENDPOINT>../hl7ehrsfm.ttl</ENDPOINT>
        <VAR>?ehr</VAR>
        <PAGESIZE>-1</PAGESIZE>
        <RESTRICTION>?ehr a hito:FeatureClassified; hito:featureCatalogue hito:EhrSfmFeatureCatalogue.</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase->regularalphabet RENAME label</PROPERTY>
        <TYPE>TURTLE</TYPE>
    </TARGET>

    <METRIC>trigrams(di.label,ehr.label)</METRIC>

    <ACCEPTANCE>
        <THRESHOLD>0.7</THRESHOLD>
        <FILE>di-ehr-close.ttl</FILE>
        <RELATION>skos:closeMatch</RELATION>
    </ACCEPTANCE>

    <REVIEW>
        <THRESHOLD>0.2</THRESHOLD>
        <FILE>di-ehr-far.ttl</FILE>
        <RELATION>hito:farMatch</RELATION>
    </REVIEW>

    <EXECUTION>
        <REWRITER>default</REWRITER>
        <PLANNER>default</PLANNER>
        <ENGINE>default</ENGINE>
    </EXECUTION>

    <OUTPUT>CSV</OUTPUT>
</LIMES>

abdullahfathi commented 2 years ago

The reason is that you explicitly gave the file a .ttl name, <FILE>di-ehr-close.ttl</FILE>. In that case the output file keeps the .ttl extension regardless of the output format.
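
If that explanation holds, one thing to try is making the file names agree with the requested format. The fragment below is only a sketch of that idea: the .csv file names are hypothetical, the other elements of the configuration are assumed to stay unchanged, and whether LIMES 1.7.4 then writes CSV content was not confirmed in this thread.

```xml
<!-- Sketch only: the .csv file names are hypothetical, not taken from the thread. -->
<!-- PREFIX, SOURCE, TARGET, METRIC and EXECUTION are omitted and assumed unchanged. -->
<ACCEPTANCE>
    <THRESHOLD>0.7</THRESHOLD>
    <FILE>di-ehr-close.csv</FILE>
    <RELATION>skos:closeMatch</RELATION>
</ACCEPTANCE>

<REVIEW>
    <THRESHOLD>0.2</THRESHOLD>
    <FILE>di-ehr-far.csv</FILE>
    <RELATION>hito:farMatch</RELATION>
</REVIEW>

<OUTPUT>CSV</OUTPUT>
```

Keeping the extension and the <OUTPUT> value in sync at least avoids a .ttl file name on non-Turtle output, whichever of the two settings ends up taking precedence.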

KonradHoeffner commented 2 years ago

Shouldn't it be the other way around? Intuitively, I would expect the format to take precedence.