Closed: Murchana closed this issue 2 years ago.
Hey @Murchana, sorry for the late reply (I didn't receive any email notification). The code was under major improvement during the time you reported this issue; I will have a look and get back to you later.
Hey @Murchana, I have re-run the evaluation code, and the results are still the same as before.
python om_eval.py -p experiments/umls/snomed2fma.body.us/edit_sim/pair_score/src2tgt -a data/UMLS/equiv_match/refs/snomed2fma.body/unsupervised/test.cands.tsv --hits_at 1
which gave me:
######################################################################
### Eval using Hits@K, MRR ###
######################################################################
635619/635619 of scored mappings are filled to corresponding anchors.
{
"MRR": 0.895,
"Hits@1": 0.869
}
Could you please check the most recent usage of the evaluation script and download the most recent data resources, where the anchored mappings are simplified to a .tsv format?
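For reference, here is a minimal sketch of reading such a tab-separated file; the three-column layout (source IRI, target IRI, score) is an assumption based on the snapshots below, not the repository's actual schema, so please check the released data for the real header.

```python
import csv

# Hypothetical reader for a tab-separated mappings file such as
# test.cands.tsv; the column layout is assumed from the snapshots
# below (source IRI, target IRI, score).
with open("test.cands.tsv", newline="") as f:
    for row in csv.reader(f, delimiter="\t"):
        print(row)  # e.g., [src_iri, tgt_iri, score]
```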
I have re-run it again, and the MRR and Hits@K values have changed slightly; the reason is that the ordering of entities with the same mapping score is non-deterministic (a deterministic tie-break is sketched after the Output 2 results below), for example:
# EditSim Output 1 Snapshot
http://snomed.info/id/82095009 http://purl.org/sig/ont/fma/fma58709 1.0
http://snomed.info/id/82095009 http://purl.org/sig/ont/fma/fma58708 0.8055555555555556
http://snomed.info/id/82095009 http://purl.org/sig/ont/fma/fma58714 0.8055555555555556
# EditSim Output 2 Snapshot
http://snomed.info/id/82095009 http://purl.org/sig/ont/fma/fma58709 1.0
http://snomed.info/id/82095009 http://purl.org/sig/ont/fma/fma58714 0.8055555555555556
http://snomed.info/id/82095009 http://purl.org/sig/ont/fma/fma58708 0.8055555555555556
The evaluation results for Output 2 are shown below:
######################################################################
### Eval using Hits@K, MRR ###
######################################################################
635619/635619 of *unique* scored mappings are valid and filled to corresponding anchors.
659530/659530 of anchored mappings are scored; for local ranking evaluation, all anchored mappings should be scored.
{
"MRR": 0.892,
"Hits@1": 0.865,
"Hits@5": 0.926,
"Hits@10": 0.946,
"Hits@30": 0.977,
"Hits@100": 1.0
}
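As promised above, here is a minimal sketch of how such ties could be broken deterministically; this helper is hypothetical and not part of the repository. Sorting by descending score with the target IRI as a secondary key makes equal-score candidates come out in the same order on every run, which would stabilize the MRR and Hits@K values.

```python
def sort_candidates(candidates):
    """Sort (tgt_iri, score) pairs by descending score, breaking ties
    by the target IRI so the ordering is reproducible across runs."""
    return sorted(candidates, key=lambda c: (-c[1], c[0]))

# The tied candidates from the EditSim snapshots above.
cands = [
    ("http://purl.org/sig/ont/fma/fma58709", 1.0),
    ("http://purl.org/sig/ont/fma/fma58714", 0.8055555555555556),
    ("http://purl.org/sig/ont/fma/fma58708", 0.8055555555555556),
]
# Equal-score entries now always come out in IRI order.
print(sort_candidates(cands))
```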
Thanks!
Another question related to BertMap: how many candidates are used for calculating the MRR and Hits@1?
As stated in the resource paper, all the systems are evaluated (for local ranking) against 100 negative candidates, so 101 candidates overall (the 100 negatives plus the reference mapping itself).
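As a rough illustration (a sketch under the assumption that each reference mapping is ranked among its 101 candidates, not the repository's actual implementation), MRR and Hits@K over such local rankings could be computed as:

```python
def mrr_and_hits(ranks, ks=(1, 5, 10, 30, 100)):
    """Compute MRR and Hits@K from a list of 1-based ranks, where each
    rank is the position of the reference mapping among its 101
    candidates (the reference itself plus 100 negatives)."""
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    hits = {k: sum(r <= k for r in ranks) / n for k in ks}
    return mrr, hits

# Example: three reference mappings ranked 1st, 2nd, and 4th.
print(mrr_and_hits([1, 2, 4]))
```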
Hi, I am not able to reproduce the exact H@1 and MRR for EditSim on the FMA-SNOMED task as reported in Table 4 of https://arxiv.org/pdf/2205.03447.pdf.
This is the command used:
python om_eval.py --saved_path './om_results' --pred_path './onto_match_experiment2/edit_sim/global_match/src2tgt' --ref_anchor_path 'data/equiv_match/refs/snomed2fma.body/unsupervised/src2tgt.rank/for_eval' --hits_at 1
These are the generated numbers: H@1: 0.841, MRR: 0.89. Reported numbers in the paper: H@1: 0.869, MRR: 0.895.
I am not sure why the numbers are inconsistent. Is there anything that needs to be modified in the code to reproduce the reported numbers?