Spico197 / DocEE

🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.
https://doc-ee.readthedocs.io/
MIT License
232 stars 36 forks source link

Fix the inference issue #55 #56

Closed Shiina18 closed 1 year ago

Shiina18 commented 1 year ago

The way to passing a doc in inference.py is misleading, and one cannot reproduce the result presented in the paper this way.

It can be found that sentences are passed as they are when the model is trained. We should also pass sentences like that during inference time, rather than pass a whole doc and then get it segmented. And in the modified way, the comparable result (f1 0.78~0.79) can be reproduced.

Spico197 commented 1 year ago

Many thanks for the PR!

Actually, there are some post-edits in the predict_one function to avoid returning arguments containing [UNK], which may also result in unexpected results. So I suggest we NEVER use inference.py to reproduce results on ChFinAnn (important, 'cause DuEE-Fin results are obtained in a similar way, so this suggestion is available only on ChFinAnn).

Nice PR~ I'll change a little bit and add a notice in the README file. Please stay tuned.

codecov[bot] commented 1 year ago

Codecov Report

Merging #56 (c9b2f8c) into main (3fe44b4) will decrease coverage by 0.00%. The diff coverage is 50.00%.

@@            Coverage Diff             @@
##             main      #56      +/-   ##
==========================================
- Coverage   20.33%   20.32%   -0.01%     
==========================================
  Files          20       20              
  Lines        4539     4541       +2     
==========================================
  Hits          923      923              
- Misses       3616     3618       +2     
Impacted Files Coverage Δ
dee/helper/__init__.py 6.68% <50.00%> (-0.04%) :arrow_down:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

Spico197 commented 1 year ago

I extracted the sent_seg behaviour out of predict_one API to make sure there's no more misunderstandings by the Union[str, List[str]] type.

Spico197 commented 1 year ago

Thanks for your contribution, Shiina18~