dwslab / melt

MELT - Matching EvaLuation Toolkit

Wrong performance.csv #227

Open · FelixFrizzy opened this issue 1 week ago

FelixFrizzy commented 1 week ago

Describe the bug

When using the evaluation-client to run logmap-bio on the DH tracks, I get a wrong performance.csv for the "tadirah-unesco" test case. It shows 0 TP, but the "true" number is higher when I compare the systemAlignment.rdf to the reference.rdf of the test case. I think it is important to find out whether this is an issue on the matcher's side (unproblematic) or an issue in MELT. The latter would be more problematic, since the evaluation relies heavily on the numbers in the performance.csv files.
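
For what it's worth, here is a minimal sketch of that manual comparison (plain JAXP rather than the MELT API; the file names are placeholders, it assumes the files are well-formed XML, and it only compares entity pairs, ignoring relation and confidence):

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ManualAlignmentCheck {

    // Collect all (entity1, entity2) resource pairs from an alignment-format file.
    static Set<String> readPairs(String path) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new File(path));
        Set<String> pairs = new HashSet<>();
        NodeList cells = doc.getElementsByTagName("Cell");
        for (int i = 0; i < cells.getLength(); i++) {
            Element cell = (Element) cells.item(i);
            String e1 = ((Element) cell.getElementsByTagName("entity1").item(0))
                    .getAttribute("rdf:resource");
            String e2 = ((Element) cell.getElementsByTagName("entity2").item(0))
                    .getAttribute("rdf:resource");
            pairs.add(e1 + " -> " + e2);
        }
        return pairs;
    }

    public static void main(String[] args) throws Exception {
        Set<String> system = readPairs("systemAlignment.rdf");
        Set<String> reference = readPairs("reference.rdf");

        Set<String> tp = new HashSet<>(system);
        tp.retainAll(reference);   // in both: true positives
        Set<String> fp = new HashSet<>(system);
        fp.removeAll(reference);   // only in the system alignment: false positives
        Set<String> fn = new HashSet<>(reference);
        fn.removeAll(system);      // only in the reference: false negatives

        System.out.printf("TP=%d FP=%d FN=%d%n", tp.size(), fp.size(), fn.size());
    }
}
```

This is only a rough cross-check, not how MELT computes the numbers internally.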

To Reproduce

Steps to reproduce the behavior:

performance.csv:

```
Type,Precision (P),Recall (R),Residual Recall (R+),F1,"# of TP","# of FP","# of FN","# of Correspondences",Time,Time (HH:MM:SS)
ALL,0.0,0.0,0.0,0.0,0,0,15,0,3763809750,00:00:03
CLASSES,0.0,0.0,0.0,0.0,0,0,0,0,-,-
PROPERTIES,0.0,0.0,0.0,0.0,0,0,0,0,-,-
INSTANCES,0.0,0.0,0.0,0.0,0,0,15,0,-,-
```
systemAlignment.rdf (removed unneeded alignments for readability):

```xml
<Alignment>
  <xml>yes</xml>
  <level>0</level>
  <type>??</type>
  <onto1>https://vocabs.dariah.eu/tadirah/</onto1>
  <onto2>http://logmap-tests/oaei/target.owl</onto2>
  <uri1>https://vocabs.dariah.eu/tadirah/</uri1>
  <uri2>http://logmap-tests/oaei/target.owl</uri2>
  <map>
    <Cell>
      <entity1 rdf:resource="..."/>
      <entity2 rdf:resource="..."/>
      <measure rdf:datatype="xsd:float">1.0</measure>
      <relation>=</relation>
    </Cell>
  </map>
  ...
</Alignment>
```
[reference.rdf](https://github.com/FelixFrizzy/DH-benchmark/blob/main/dhcs2_tadirah-unesco/reference.rdf) of the tadirah-unesco test case (removed unneeded alignments for readability):

```xml
<Alignment>
  <xml>yes</xml>
  <level>0</level>
  <type>**</type>
  <onto1>https://vocabs.dariah.eu/tadirah/</onto1>
  <onto2>http://vocabularies.unesco.org/thesaurus/</onto2>
  <uri1>https://vocabs.dariah.eu/tadirah/</uri1>
  <uri2>http://vocabularies.unesco.org/thesaurus/</uri2>
  <map>
    <Cell>
      <entity1 rdf:resource="..."/>
      <entity2 rdf:resource="..."/>
      <measure rdf:datatype="xsd:float">1.0</measure>
      <relation>=</relation>
    </Cell>
  </map>
  ...
</Alignment>
```

The correctly identified alignment is not reflected in the performance.csv (along with all the other TPs).

Full log output

issue.log

Expected behavior

The performance.csv should list 10 TP and 5 FP instead of 0 TP and 15 FN.
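
(If those expected counts are right, the system alignment contains 15 correspondences of which 10 hit the 15 reference correspondences, so 5 of the reference correspondences remain as FN, and both P = TP/(TP+FP) = 10/15 ≈ 0.67 and R = TP/(TP+FN) = 10/15 ≈ 0.67 should be reported instead of 0.0.)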

sven-h commented 1 week ago

Hi,

thanks for raising this issue. Could you check whether it also happens with a single test case, using the code provided below:

```java
// Run the matcher on one test case and write the evaluation CSVs:
ExecutionResultSet ers = Executor.run(TrackRepository.DigitalHumanities.All2024.getTestCase("..."), new ForwardAlwaysMatcher("./system.xml"));
EvaluatorCSV evaluatorCSV = new EvaluatorCSV(ers);
evaluatorCSV.writeToDirectory();
```

Then we have a small reproducible setup, and I can check where the error actually appears. Thanks

FelixFrizzy commented 1 week ago

If I understand the source code correctly, the system.xml file should be the reference.rdf of the track? This is what I then get when using

```java
ExecutionResultSet ers = Executor.run(TrackRepository.DigitalHumanities.Dhcs2024.getTestCase(1), new ForwardAlwaysMatcher("./system.xml"));
```

for the trackPerformanceCube.csv:

```
Track,Track Version,Test Case,Matcher,Type,Precision (P),Recall (R),Residual Recall (R+),F1,"# of TP","# of FP","# of FN","# of Correspondences",Time,Time (HH:MM:SS)
dh,2024dhcs,tadirah-unesco,ForwardAlwaysMatcher,ALL,1.0,1.0,1.0,1.0,15,0,0,15,98571192,00:00:00
dh,2024dhcs,tadirah-unesco,ForwardAlwaysMatcher,CLASSES,0.0,0.0,0.0,0.0,0,0,0,0,-,-
dh,2024dhcs,tadirah-unesco,ForwardAlwaysMatcher,PROPERTIES,0.0,0.0,0.0,0.0,0,0,0,0,-,-
dh,2024dhcs,tadirah-unesco,ForwardAlwaysMatcher,INSTANCES,1.0,1.0,1.0,1.0,15,0,0,15,-,-
```

FelixFrizzy commented 1 week ago

Some more details: I used the systemAlignment.rdf as system.xml (which makes more sense for testing everything end to end). When doing so, I get an empty trackPerformanceCube.csv with headers only.

I tracked the problem down; it seems to be a problem on logmap-bio's side.

The systemAlignment contains lines like this:

```xml
<map>
    <Cell>
        <entity1 rdf:resource="http://tadirah.dariah.eu/vocab/index.php?tema=24&/editing"/>
        <entity2 rdf:resource="http://vocabularies.unesco.org/thesaurus/concept3810"/>
        <measure rdf:datatype="xsd:float">1.0</measure>
        <relation>=</relation>
    </Cell>
</map>
```

The resource of entity1 should be https://vocabs.dariah.eu/tadirah/editing; logmap-bio somehow got this wrong and used the skos:closeMatch URI instead. But only some of the entities have the wrong URI; most of them are correct.

It would be nice if the correct mappings were also reflected in the output CSV, but I'm not sure if that is possible.
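
In case someone needs a stopgap until that is sorted out, here is a rough post-processing sketch (plain JAXP, not a fix in MELT or logmap-bio; the namespace constant and file names are my assumptions) that drops every mapping whose entity1 lies outside the expected TaDiRAH namespace before evaluation:

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DropForeignCells {
    public static void main(String[] args) throws Exception {
        // Assumed source-ontology namespace; every entity1 should start with it.
        String expectedNs = "https://vocabs.dariah.eu/tadirah/";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new File("systemAlignment.rdf"));

        NodeList cells = doc.getElementsByTagName("Cell");
        // Iterate backwards: removing nodes shrinks the live NodeList.
        for (int i = cells.getLength() - 1; i >= 0; i--) {
            Element cell = (Element) cells.item(i);
            String e1 = ((Element) cell.getElementsByTagName("entity1").item(0))
                    .getAttribute("rdf:resource");
            if (!e1.startsWith(expectedNs)) {
                Node map = cell.getParentNode();       // the enclosing <map> element
                map.getParentNode().removeChild(map);  // drop the whole mapping
            }
        }
        TransformerFactory.newInstance().newTransformer().transform(
                new DOMSource(doc), new StreamResult(new File("systemAlignment.filtered.rdf")));
    }
}
```

Note that this only works if the alignment is well-formed XML in the first place; the unescaped & in the URI above would have to be written as &amp; for any XML parser to accept the file.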