Closed Yingshu-Li closed 6 months ago
Hi,
We use macro-weighted metrics for evaluation. You may also want to read the comment in this issue: https://github.com/zhjohnchan/R2GenCMN/issues/12.
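For readers comparing averaging modes, here is a minimal sketch of how macro and micro F1 can diverge on multi-label output. The toy label matrices and the helper names (`label_counts`, `f1_from_counts`, `macro_micro_f1`) are hypothetical and not taken from the repository, and "macro-weighted" may refer to a different variant than the plain macro average shown here.

```python
def label_counts(y_true, y_pred, j):
    """Return (tp, fp, fn) for label column j over all reports."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred):
    """Macro: mean of per-label F1. Micro: F1 of the pooled counts."""
    n_labels = len(y_true[0])
    per_label = [label_counts(y_true, y_pred, j) for j in range(n_labels)]
    macro = sum(f1_from_counts(*c) for c in per_label) / n_labels
    tp = sum(c[0] for c in per_label)
    fp = sum(c[1] for c in per_label)
    fn = sum(c[2] for c in per_label)
    micro = f1_from_counts(tp, fp, fn)
    return macro, micro


# Toy data (made up): 4 reports, 2 labels; label 0 is frequent, label 1 is rare.
y_true = [[1, 0], [1, 0], [1, 1], [1, 0]]
y_pred = [[1, 0], [1, 0], [1, 0], [1, 1]]

macro, micro = macro_micro_f1(y_true, y_pred)
print(f"macro F1 = {macro:.2f}, micro F1 = {micro:.2f}")
```

In this toy case the rare label is always predicted wrongly, so it pulls the macro average down much more than the micro average. A gap of this kind is one possible reason why reproduced scores differ from reported ones.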
Best, Ethan
Hi,
I tried the method you mentioned. I used CheXpert to label the reports, but I cannot reproduce the results: my scores are much lower.
Here is my method:
Could you please help me check whether I have made any mistakes?
Best
Hi,
We set -1 (i.e., Uncertain) to 1 in the calculation, similar to https://github.com/zzxslp/WCL/blob/main/chexpert-labeler/calculate_metric.py.
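As an illustration only, the remapping described above might look like the sketch below. The function name `to_binary` and the sample row are hypothetical and are not copied from the linked script. Treating blank or NaN entries (label not mentioned) as negative is an assumption here, so check it against the script you are comparing with.

```python
import math


def to_binary(value, uncertain_as=1):
    """Map one CheXpert labeler output to 0 or 1.

    1 -> positive, 0 -> negative, -1 -> uncertain (mapped to `uncertain_as`).
    Blank / None / NaN means "not mentioned"; ASSUMPTION: treated as negative.
    """
    if value is None or value == "":
        return 0
    v = float(value)
    if math.isnan(v):
        return 0
    if v == -1.0:
        return uncertain_as
    return 1 if v == 1.0 else 0


# Made-up row mixing the value types that can appear after reading a CSV.
row = [1.0, -1.0, 0.0, None, float("nan"), "-1.0"]
binary = [to_binary(v) for v in row]
print(binary)  # [1, 1, 0, 0, 0, 1]
```

Apply the same mapping to both the reference labels and the labels extracted from the generated reports before computing precision, recall, and F1. Mapping uncertain to 0 on one side and to 1 on the other would change the scores.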
Could you provide the scores (NLG and CE scores) you obtained?
Best, Ethan
Hi,
Are the scores reported in this paper macro, micro or macro-based F1 scores?
Thanks a lot in advance!