shun-zheng / Doc2EDAG

MIT License

Is `gold_span` being mixed with `pred_span` when evaluating in `pred_span` mode? #19

Closed Spico197 closed 3 years ago

Spico197 commented 3 years ago

Hi there. Thanks for your excellent work.

In the code here, the gold spans are used whenever there are no predicted spans. Doesn't this risk biasing the evaluation, since the model can still receive the correct spans even when it predicts none?

I would appreciate it if you could comment and explain further.

shun-zheng commented 3 years ago

Well, you are right. There is indeed a risk of leaking gold spans, but that risk is nearly negligible when the model is well trained.

The code here is just a hotfix to avoid a training crash at an early stage. Once the model is well trained, the NER module always produces some candidate spans, so the length of `span_token_tup_list` is greater than zero and the model does not use the gold spans. In our experience on the ChFinAnn dataset, a well-trained model always produces some candidate spans, so the aforementioned risk is limited.

We appreciate your findings. It would be better to disable that statement during evaluation so that it is risk-free.
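
For readers following along, here is a minimal sketch of the guard described above. It is not the repository's actual code: the function name `select_candidate_spans`, the `training` flag, and the `gold_span_token_tup_list` argument are hypothetical, and only `span_token_tup_list` comes from this thread. The idea is to keep the gold-span fallback during training, where it prevents the early crash, and to return no spans during evaluation so that nothing leaks.

```python
from typing import List, Tuple

# Illustrative type: one span is a tuple of token ids.
Span = Tuple[int, ...]


def select_candidate_spans(
    span_token_tup_list: List[Span],
    gold_span_token_tup_list: List[Span],  # hypothetical argument name
    training: bool,                        # hypothetical flag
) -> Tuple[List[Span], bool]:
    """Return (spans to use, whether gold spans were substituted)."""
    if len(span_token_tup_list) > 0:
        # The NER module produced candidates, so gold spans are never needed.
        return span_token_tup_list, False
    if training:
        # Early in training the NER module may predict nothing;
        # fall back to gold spans so the training step does not crash.
        return gold_span_token_tup_list, True
    # Evaluation: never substitute gold spans, so the score stays unbiased.
    return [], False


gold = [(101, 102), (205,)]
print(select_candidate_spans([], gold, training=True))
print(select_candidate_spans([], gold, training=False))
```

With this guard, downstream evaluation code would need to handle an empty span list, for example by predicting no events for that document. Returning the boolean also makes it possible to count how often the fallback fires, which is one way to check the claim that a well-trained model rarely triggers it.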

Spico197 commented 3 years ago

Thank you very much for your reply.