Problem with the F1 score on JNLPBA?

Eulring commented 4 years ago

The local-prediction result I get on the JNLPBA dataset is about 0.72, which is far from the 0.83 reported in the paper. Is there anything wrong, e.g. the metric selected for multi-entity evaluation?

By the way, the results on the other four datasets agree with the paper.

xhuang28 commented 4 years ago

Thanks for the comment! You are probably right. The scores are somewhat too high. In a previous submission of the paper (made by Li Dong before I joined), the score was 73.19 (you may use this score if you are comparing with our work). I revised the model later, but that shouldn't make so much difference. Unfortunately, I don't have access to the server that stores all the logs now. I only have some detailed scores (F/P/R for JNLPBA):

| Model | F | P | R |
|---|---|---|---|
| STM | 72.73 | 70.58 | 75.01 |
| MTM | 73.48 | 70.83 | 76.33 |
| UM-01 | 80.90 | 85.11 | 77.09 |
| UM-11 | 80.85 | 76.56 | 85.65 |
| UM-00 | 83.82 | 83.74 | 83.91 |
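As a quick sanity check on the table above, the F column can be recomputed from P and R. This is only a sketch, assuming F is the standard F1 (the harmonic mean of precision and recall); the names `scores`, `f1`, and `mismatches` are illustrative and do not come from the NewBioNer code.

```python
# JNLPBA scores copied from the table above (percentages).
scores = {
    "STM":   {"F": 72.73, "P": 70.58, "R": 75.01},
    "MTM":   {"F": 73.48, "P": 70.83, "R": 76.33},
    "UM-01": {"F": 80.90, "P": 85.11, "R": 77.09},
    "UM-11": {"F": 80.85, "P": 76.56, "R": 85.65},
    "UM-00": {"F": 83.82, "P": 83.74, "R": 83.91},
}


def f1(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumed tolerance: P, R, and F are each rounded to two decimals,
# so small differences from the recomputed value are expected.
TOLERANCE = 0.02

# Collect rows whose reported F disagrees with the value recomputed from P and R.
mismatches = [
    model
    for model, row in scores.items()
    if abs(f1(row["P"], row["R"]) - row["F"]) > TOLERANCE
]

for model, row in scores.items():
    print(f"{model}: reported F={row['F']:.2f}, recomputed F={f1(row['P'], row['R']):.2f}")
print("rows outside tolerance:", mismatches)
```

A check like this only tells whether each row is internally consistent. It cannot say whether the predictions behind the scores were evaluated with the intended metric or on the intended data.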

I'll look into it, but I can't guarantee that I'll find the correct scores. Local prediction is not the focus of the paper, and we are not claiming to beat the baselines on it. It's good to know the rest of the scores are correct.

Eulring commented 4 years ago

Got it. Thanks for the reply.

Eulring commented 4 years ago

Hello @xhuang28, we are comparing against your work in our new research. Would it be convenient for you to provide the F1 scores of local evaluation on the five training datasets (BC2GM, BC4CHEM, JNLPBA, NCBI, Linnaeus) for CRF00, CRF01, CRF11, and MTM?

xhuang28 commented 4 years ago

| Model | BC2GM | BC4CHEM | NCBI | JNLPBA | Linnaeus |
|---|---|---|---|---|---|
| STM | 79.87 | 88.59 | 84.11 | 72.73 | 87.33 |
| MTM | 80.27 | 89.23 | 85.77 | 73.48 | 88.54 |
| UM-01 | 70.94 | 83.47 | 79.81 | 80.90 | 79.94 |
| UM-11 | 74.24 | 84.13 | 80.45 | 80.85 | 80.69 |
| UM-00 | 79.12 | 87.27 | 83.98 | 83.82 | 83.93 |

xhuang28 / NewBioNer
