Closed Shiina18 closed 1 year ago
Many thanks for the PR!
Actually, there are some post-edits in the predict_one
function to avoid returning arguments containing [UNK]
, which may also result in unexpected results.
So I suggest we NEVER use inference.py
to reproduce results on ChFinAnn (important, 'cause DuEE-Fin results are obtained in a similar way, so this suggestion is available only on ChFinAnn).
Nice PR~ I'll change a little bit and add a notice in the README file. Please stay tuned.
Merging #56 (c9b2f8c) into main (3fe44b4) will decrease coverage by
0.00%
. The diff coverage is50.00%
.
@@ Coverage Diff @@
## main #56 +/- ##
==========================================
- Coverage 20.33% 20.32% -0.01%
==========================================
Files 20 20
Lines 4539 4541 +2
==========================================
Hits 923 923
- Misses 3616 3618 +2
Impacted Files | Coverage Δ | |
---|---|---|
dee/helper/__init__.py | 6.68% <50.00%> (-0.04%) |
:arrow_down: |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
I extracted the sent_seg
behaviour out of predict_one
API to make sure there's no more misunderstandings by the Union[str, List[str]]
type.
Thanks for your contribution, Shiina18~
The way to passing a doc in
inference.py
is misleading, and one cannot reproduce the result presented in the paper this way.It can be found that sentences are passed as they are when the model is trained. We should also pass sentences like that during inference time, rather than pass a whole doc and then get it segmented. And in the modified way, the comparable result (f1 0.78~0.79) can be reproduced.