AIPHES / DiscoScore

DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence

Incorrect F score formula #3

Closed: m0baxter closed this issue 1 year ago

m0baxter commented 1 year ago

The formula used here:

https://github.com/AIPHES/DiscoScore/blob/4f2c5934eea8f3ea443e4a133d54277d6e32e23a/disco_score/metrics/discourse.py#L143

for the F-score version of focus score is incorrect. It should be

F = 2 / (1 / R + 1 / P)

I know this isn't explicitly used here, but it is referenced in the paper.

andyweizhao commented 1 year ago

Hi @m0baxter, yes. What we implemented is not the harmonic mean of P and R, but rather the arithmetic mean of the two. In our experiments, the arithmetic mean performs better. Thanks for pointing this out! We will add a note in the updated paper.
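For readers comparing the two definitions discussed in this thread, here is a minimal sketch of both. The function names `harmonic_f` and `arithmetic_mean` are hypothetical and are not taken from the DiscoScore code; returning 0 when P or R is not positive is also an assumption made only to avoid division by zero.

```python
def harmonic_f(p: float, r: float) -> float:
    """Standard F-score: harmonic mean of P and R, F = 2 / (1 / R + 1 / P)."""
    # Assumption: treat a non-positive P or R as yielding a score of 0.
    if p <= 0.0 or r <= 0.0:
        return 0.0
    return 2.0 / (1.0 / r + 1.0 / p)


def arithmetic_mean(p: float, r: float) -> float:
    """Plain average of P and R, the variant described in the reply."""
    return (p + r) / 2.0


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    p, r = 0.9, 0.3
    print("harmonic mean  :", harmonic_f(p, r))
    print("arithmetic mean:", arithmetic_mean(p, r))
```

With P = 0.9 and R = 0.3, the harmonic mean is about 0.45 while the arithmetic mean is about 0.6. The two agree only when P equals R; otherwise the harmonic mean is lower and penalizes an imbalance between P and R more strongly.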