wikilinks / neleval

Entity disambiguation evaluation and error analysis tool
Apache License 2.0

Entity CEAF true positives and false positives off by factor of 2. #29

Closed: cpalenmichel closed this issue 7 years ago

cpalenmichel commented 7 years ago

I believe a 2 is missing from the definition of the dice coefficient (line 343 of https://github.com/wikilinks/neleval/blob/master/neleval/coref_metrics.py), which causes the true positives and false positives to be off by a factor of 2. Luckily this doesn't affect precision, recall or F-score, since everything ends up off by the same factor, so the issue is definitely not urgent, but it should be an easy fix.

My understanding of CEAF is based on what I read in http://www.aclweb.org/anthology/H05-1004. On page 28 they define the similarity metrics. It's also possible I've misunderstood how this metric is calculated. If that's the case, kindly let me know.
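
For reference, here is a minimal sketch of the mention-based entity similarity (the dice coefficient) as defined on page 28 of that paper; the function and argument names are illustrative only and do not mirror neleval's internals in coref_metrics.py:

```python
# Dice coefficient between a key cluster K and a response cluster R,
# as defined in Luo (2005): phi(K, R) = 2 * |K ∩ R| / (|K| + |R|).
# Illustrative sketch only, not the actual neleval implementation.
def dice(key_cluster, response_cluster):
    key, response = set(key_cluster), set(response_cluster)
    if not key and not response:
        return 0.0
    # The leading 2 is the factor reported missing on line 343.
    return 2.0 * len(key & response) / (len(key) + len(response))
```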

jnothman commented 7 years ago

Thanks, I think you're right.

jnothman commented 7 years ago

We've always tested on the basis of P, R and F, so it's not surprising this was missed. Would you like to offer a pull request to patch both dice and _vectorized_dice? If not, I'll do it; just let me know. Thanks.
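
As a rough idea of the vectorized variant, here is a sketch that computes dice for every (key cluster, response cluster) pair at once, assuming clusters are encoded as sparse mention-incidence matrices; this is an assumption for illustration and is not neleval's actual _vectorized_dice:

```python
import numpy as np
from scipy import sparse

def vectorized_dice(key_matrix, response_matrix):
    """Dice similarity for every (key cluster, response cluster) pair.

    Both arguments are sparse {0, 1} matrices of shape (n_clusters, n_mentions),
    where entry [i, j] is 1 if mention j belongs to cluster i. Illustrative
    sketch only; clusters are assumed non-empty.
    """
    intersections = key_matrix.dot(response_matrix.T).toarray()
    key_sizes = np.asarray(key_matrix.sum(axis=1))           # shape (n_key, 1)
    resp_sizes = np.asarray(response_matrix.sum(axis=1)).T   # shape (1, n_resp)
    # Same factor of 2 as in the pairwise dice coefficient.
    return 2.0 * intersections / (key_sizes + resp_sizes)
```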

cpalenmichel commented 7 years ago

Sorry for the delay. This should be fixed by https://github.com/wikilinks/neleval/pull/32