Performance Drop Issues (AIDA test A)

liehe commented 7 years ago

According to the paper, the performance of PBoH on AIDA test A is 86.63/85.48. After the GERBIL upgrade, the performance of PBoH reported here is 75.19/73.3.

However, when I try to reproduce the result, I get the following (64.84/64.32).

############### RESULTS for dataset AIDA test A for TEST w = loopybeliefpropagation.ScorerWeights@3cb5cdba params a = 0.5, f = 1.0, g = 0.5, h = 1.0, s = 0.0, b = 0.075 #################
Num total docs = 216
Num total mentions (including duplicates) = 4781

Looking at docs with GLOBAL mentions:
GLOBAL mentions : num docs evaluated = 216; num mentions in solution = 4065.0 num mentions in ground truth = 4781.0

#################################
GLOBAL mentions : micro F1 (per mention) Loopy : 64.84286683246664
GLOBAL mentions : micro accuracy/recall (per mention) Loopy : 59.987450324199955
GLOBAL mentions : MACRO F1 (per doc) Loopy : 64.3229987260122
GLOBAL mentions : MACRO accuracy/recall (per doc) Loopy : 58.679661769593
###################################

GLOBAL mentions : micro F1 (per mention) ARGMAX : 62.85326701333936
GLOBAL mentions : micro acc/recall (per mention) ARGMAX : 58.1468312068605
GLOBAL mentions : MACRO F1 (per doc) ARGMAX : 61.799181989071386
GLOBAL mentions : MACRO acc/recall (per doc) ARGMAX : 56.37695767780255

GLOBAL mentions : MACRO (per doc) common Loopy - ARGMAX : 92.41201422346343
GLOBAL mentions : micro (per mention) common Loopy - ARGMAX : 94.98154981549816
GLOBAL mentions : micro (per mention) perc missing mentions from index : 14.975946454716587
GLOBAL mentions : micro (per mention) perc missing entities from mention index : 17.025726835390085

GLOBAL mentions : avg LBP running time (milliseconds) : 71.80092592592592
GLOBAL mentions : avg num iters in LBP : 2.6805555555555554
GLOBAL mentions : percentage cases where LBP converged: 100.0
GLOBAL mentions : avg num candidates per mention: 5.67029883619935

==============================================

I used the index files from polybox and updated their locations in the code.
I changed the dataset path from

val file = "/media/hofmann-scratch/Octavian/entity_linking/marinah/AIDA/testa_testb_aggregate"

to "AIDA-YAGO2-dataset.tsv", which is generated from the files downloaded from MPI-INF.

I use

java -Xmx90g -cp target/PBoH-1.0-SNAPSHOT-jar-with-dependencies.jar el.EL_LBP_Spark testPBOHOnAllDatasets max-product

to run the code because the command

scala -J-Xmx90g target/PBoH-1.0-SNAPSHOT-jar-with-dependencies.jar testPBOHOnAllDatasets max-product

throws an UnsatisfiedLinkError when it tries to use leveldbjni.

Did I make any mistakes in the process? How can I reproduce the GERBIL result?

Thanks.

octavian-ganea commented 7 years ago

The ARGMAX results represent the "Local Mention" prior, and they should be much higher (cf. Table 3 in our paper). Which p(e|m) indexes do you use? Are you using the ones that we provided, to be found here: https://polybox.ethz.ch/index.php/s/IOWjGrU3mjyzDSV/authenticate

It seems there is a big overlap between PBOH and LocalMention (the "common Loopy - ARGMAX" part). I will try to re-run it tonight on a fresh machine if you still cannot solve this issue. Can you please send me your full output log file by e-mail?

liehe commented 7 years ago

Hi, thanks for your fast response.

I have not changed the method or the index itself. All I have changed is the index path in the code, the encoding (I added UTF-8 when using Source.fromFile()), and the AIDA dataset name. The one given in AIDA.scala is "testa_testb_aggregate"; I didn't have a file with this name, so I used the output file from "aida-yago2-dataset.jar". Also, I only ran AIDA test A and skipped all the other datasets to save time.

I am going to run it again to see if the result is the same. If so, I will send you the output log.

octavian-ganea commented 7 years ago

I am not sure what the output file from "aida-yago2-dataset.jar" is, but your testa_testb_aggregate should contain the AIDA-A and AIDA-B datasets and be generated as described on the MPI website. It should look as follows (sorry, it is licensed by MPI and I cannot upload the full file myself): one word per line, with tab-separated annotations when the word is part of a mention.

-DOCSTART- (947testa CRICKET)
CRICKET
-
LEICESTERSHIRE  B   LEICESTERSHIRE  Leicestershire_County_Cricket_Club  http://en.wikipedia.org/wiki/Leicestershire_County_Cricket_Club    1622318 /m/05hf4j
TAKE
OVER
AT
TOP
AFTER
INNINGS
VICTORY
.

LONDON  B   LONDON  London  http://en.wikipedia.org/wiki/London 17867   /m/04jpl
1996-08-30

West    B   West Indian West_Indies_cricket_team    http://en.wikipedia.org/wiki/West_Indies_cricket_team   3379941 /m/098knd
Indian  I   West Indian West_Indies_cricket_team    http://en.wikipedia.org/wiki/West_Indies_cricket_team   3379941 /m/098knd
all-rounder
Phil    B   Phil Simmons    Phil_Simmons    http://en.wikipedia.org/wiki/Phil_Simmons   2518836 /m/07kgj4
Simmons I   Phil Simmons    Phil_Simmons    http://en.wikipedia.org/wiki/Phil_Simmons   2518836 /m/07kgj4
took
four
for
38
on
Friday
as
Leicestershire  B   Leicestershire  Leicestershire_County_Cricket_Club  http://en.wikipedia.org/wiki/Leicestershire_County_Cricket_Club    1622318 /m/05hf4j
beat
Somerset    B   Somerset    Somerset_County_Cricket_Club    http://en.wikipedia.org/wiki/Somerset_County_Cricket_Club   1622178    /m/05hdty
by
an
innings
and
39
runs
in
two
days
to
take
over
at
the
head
of
the
county
championship
.

Their
stay
on
top
,
though
,
may
be
short-lived
as
title
rivals
Essex   B   Essex   Essex_County_Cricket_Club   http://en.wikipedia.org/wiki/Essex_County_Cricket_Club  1622252 /m/05hdzj
,
Derbyshire  B   Derbyshire  Derbyshire_County_Cricket_Club  http://en.wikipedia.org/wiki/Derbyshire_County_Cricket_Club 1829984    /m/05_blf
and

liehe commented 7 years ago

My dataset does have these lines, so the dataset should be fine.
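Beyond comparing the first lines by eye, a quick count over the file can be checked against the totals in the log ("Num total docs = 216", "Num total mentions (including duplicates) = 4781"). The following is only a sketch: the toy excerpt, the `DATA` variable, and the rule that a linked mention is a `B` line with at least six tab-separated fields are assumptions for illustration, and the thread does not say exactly which mentions the log's total includes.

```shell
#!/bin/sh
# Sketch: count AIDA test A documents and mention starts in the aggregate file.
set -eu
DATA=$(mktemp)

# Toy excerpt in the tab-separated layout shown in the thread.
# With real data, set DATA to the path of testa_testb_aggregate instead.
{
  printf -- '-DOCSTART- (947testa CRICKET)\n'
  printf 'CRICKET\n'
  printf 'LONDON\tB\tLONDON\tLondon\thttp://en.wikipedia.org/wiki/London\t17867\t/m/04jpl\n'
  printf 'West\tB\tWest Indian\tWest_Indies_cricket_team\thttp://en.wikipedia.org/wiki/West_Indies_cricket_team\t3379941\t/m/098knd\n'
  printf 'Indian\tI\tWest Indian\tWest_Indies_cricket_team\thttp://en.wikipedia.org/wiki/West_Indies_cricket_team\t3379941\t/m/098knd\n'
  printf -- '-DOCSTART- (1testb TOY)\n'   # made-up testb header, must not be counted
  printf 'Boston\tB\tBoston\tBoston\thttp://en.wikipedia.org/wiki/Boston\t24437894\n'
} > "$DATA"

counts=$(awk -F '\t' '
  # A document header switches counting on only for test A documents.
  /^-DOCSTART-/ { in_a = ($0 ~ /testa/); if (in_a) docs++; next }
  # "B" marks the first token of a mention; NF >= 6 assumes a Wikipedia id is present.
  in_a && $2 == "B" && NF >= 6 { mentions++ }
  END { print docs + 0, mentions + 0 }' "$DATA")

echo "testa docs and linked mentions: $counts"
```

If the counts on the real file are far from the totals in the log, the dataset file rather than the index would be the first thing to check.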

octavian-ganea commented 7 years ago

Something is clearly wrong with the p(e|m) index that you use. "perc missing mentions from index : 14.97" is the percentage of mentions m that are not found in the dictionary, while "perc missing entities from mention index : 17.02" is the percentage of gold entities that do not appear in the respective mention entry. Together, these should be less than 5%. Looking at your log file, I see that even common names like "kurdish", "tunisia" or "boston" are missing. Can you please check whether they appear in your p(e|m) file (called mek-top-freq-crosswikis-plus-wikipedia-lowercase-top64.txt, which should be constructed as a concatenation of the 2 files mek-top-freq-crosswikis-plus-wikipedia-lowercase-top64.txt.part_a*)?
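To make the two percentages concrete, here is a rough sketch that computes them outside of PBoH. It is not the project's evaluation code. It assumes the index is tab-separated with the mention in the first field and candidates written as `entityId,count`; the file `gold.tsv` (lowercased mention, gold entity id) is hypothetical, and using the total number of mentions as the denominator for both percentages is also an assumption.

```shell
#!/bin/sh
# Sketch: coverage of gold mentions and gold entities by a p(e|m) index.
set -eu
WORK=$(mktemp -d)

# Toy index: mention, then fields of the form entityId,count (layout assumed).
printf 'boston\t10\t10\t113041\t24437894,85112\t167665,4306\n' > "$WORK/pem.txt"
printf 'london\t5\t5\t900\t17867,800\n' >> "$WORK/pem.txt"

# Toy gold annotations (hypothetical file): mention, gold entity id.
printf 'boston\t24437894\nlondon\t99999\ntunisia\t12345\nboston\t167665\n' > "$WORK/gold.tsv"

stats=$(awk -F '\t' '
  NR == FNR {                       # first file: load mentions and their candidates
    seen[$1] = 1
    for (i = 2; i <= NF; i++)
      if (split($i, p, ",") == 2) cand[$1, p[1]] = 1
    next
  }
  {                                 # second file: one gold mention per line
    total++
    if (!($1 in seen)) no_mention++               # mention absent from the index
    else if (!(($1, $2) in cand)) no_entity++     # mention present, gold entity absent
  }
  END { printf "%.1f %.1f\n", 100 * no_mention / total, 100 * no_entity / total }
' "$WORK/pem.txt" "$WORK/gold.tsv")

echo "perc missing mentions, perc missing entities: $stats"
```

On the toy data, one of four mentions ("tunisia") is absent from the index and one ("london") lacks its gold entity among the candidates, so both values come out at 25.0.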

octavian-ganea commented 7 years ago

This should give a non-empty output:

cat mek-top-freq-crosswikis-plus-wikipedia-lowercase-top64.txt.part_a* | grep -P '^boston\t' | more

namely:

boston  10  10  113041  24437894,85112  167665,4306  65194,3140  43376,1974  69523,691  4339,596  23017869,451  882398,417  201767,383  182265,347  730207,324  126401,318  18346514,285  2323878,278  83622,264  5503022,263  2004519,247  513495,225  1080900,217  211579,207  10128235,199  23876058,190  5637547,179  4608353,159  876997,158  6721569,150  26372818,150  2593807,142  1692165,139  82254,138  2338329,127  110372,127  28800877,127  1423832,127  24239512,118  1843613,117  8871435,115  4319938,105  12195659,104  112680,102  206780,100  61114,100  621979,98  3387737,92  13436708,91  13832944,90  117682,88  21441922,86  4584803,86  5889873,82  230105,68  8305575,67  30055728,60  213128,60  1277059,60  148265,59  1294905,56  23911424,56  206779,53  298047,51  11773015,50  10982304,50  12202877,45  25409421,45

You need to create a new file containing the contents of both mek-top-freq-crosswikis-plus-wikipedia-lowercase-top64.txt.part_a* files, and update its path here: https://github.com/dalab/pboh-entity-linking/blob/master/src/main/scala/index/AllIndexesBox.scala#L19. Similarly, you need to update the paths of all other index files listed in the same Scala file. Let me know if it works.
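The merge and the sanity check can be sketched as a small script. The part suffixes (`part_aa`, `part_ab`), the `INDEX_DIR` variable, and the toy file contents are assumptions so that the example runs anywhere; with the real download, point `INDEX_DIR` at the directory holding the parts and drop the two `printf` lines that create the stand-ins. It uses awk instead of `grep -P`, which is not available in every grep.

```shell
#!/bin/sh
# Sketch: merge the split p(e|m) files and check that common mentions are present.
set -eu

INDEX_DIR=$(mktemp -d)
BASE=mek-top-freq-crosswikis-plus-wikipedia-lowercase-top64.txt

# Toy stand-ins for the two downloaded parts (tab-separated, mention first).
printf 'boston\t10\t10\t113041\t24437894,85112\n' > "$INDEX_DIR/$BASE.part_aa"
printf 'kurdish\t1\t1\t10\t111,10\ntunisia\t1\t1\t20\t222,20\n' > "$INDEX_DIR/$BASE.part_ab"

# Concatenate the parts in glob order into the single file the code should point to.
cat "$INDEX_DIR/$BASE".part_a* > "$INDEX_DIR/$BASE"

# Collect every expected mention that has no entry (exact match on the first field).
missing=""
for m in boston kurdish tunisia; do
  if ! awk -F '\t' -v m="$m" '$1 == m { found = 1; exit } END { exit !found }' "$INDEX_DIR/$BASE"; then
    missing="$missing $m"
  fi
done

if [ -z "$missing" ]; then
  echo "all mentions found"
else
  echo "missing:$missing"
fi
```

If any of the three names is reported as missing on the real file, the merged index is probably incomplete, for example because only one of the parts was used.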

liehe commented 7 years ago

Thank you so much. It was hard for me to pinpoint the problem. I will check the indexes.
