Closed — Jason3900 closed this issue 1 year ago
Hello,
We use the F1 score with macro averaging. The implementation we use is sklearn.metrics.f1_score. Would you please let us know which F1 score implementation you are using?
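For reference, a minimal sketch of the macro-averaged F1 computation described above, using sklearn.metrics.f1_score (the labels here are illustrative only, not from the MDD task):

```python
from sklearn.metrics import f1_score

# Toy labels just to show the call; replace with real task labels.
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# average="macro": compute F1 per class, then take the unweighted mean.
score = f1_score(y_true, y_pred, average="macro")
print(round(score, 4))  # → 0.2667
```

Macro averaging treats every class equally regardless of support, which is why it can differ noticeably from weighted or micro averaging on imbalanced data.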
On Wed, 30 Nov 2022, 23:32 jasonfang wrote:
Hey, I found that the submission results of MDD task is not the same as the ones I run locally. I'm wondering how does that happen.
Yeah, I use the exact same metric, but there turns out to be a 1pp gap. However, other macro-F1 tasks such as fid don't seem to have this issue.
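One common source of small macro-F1 gaps worth ruling out is a differing `average` argument between the local and submission-side scoring. A toy sketch (labels are illustrative only) showing that the same predictions score differently under macro vs. weighted averaging:

```python
from sklearn.metrics import f1_score

# Imbalanced toy labels: class 0 has support 3, class 1 has 2, class 2 has 1.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 1]

macro = f1_score(y_true, y_pred, average="macro")        # unweighted mean of per-class F1
weighted = f1_score(y_true, y_pred, average="weighted")  # mean weighted by class support

print(round(macro, 4), round(weighted, 4))  # → 0.4889 0.6222
```

Here the rare class 2 is never predicted, so macro averaging penalizes it much harder than weighted averaging does.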
Would you please try using scikit-learn 1.0.1, just to rule out any dependency-related issues? I have also resent you the dataset splits.
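A quick sketch of the dependency check suggested above, assuming the 1.0.1 pin from this thread (this only verifies the installed version, it does not install anything):

```python
import sklearn

# Print the installed version so it can be compared against the suggested pin.
print("scikit-learn:", sklearn.__version__)

# 1.0.1 is the version suggested in the thread; warn if the environment differs.
if sklearn.__version__ != "1.0.1":
    print("Version differs from 1.0.1; metric behavior may vary across releases.")
```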
@Jason3900 given our correspondence via email, I believe this issue is solved now. I am going to close it, but please feel free to reopen it otherwise.
Thanks a lot!