HillZhang1999 / NaSGEC

Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings)
https://arxiv.org/abs/2305.16023

real_learner_bart_CGEC_exam.pt results on the MuCGEC test set do not match the paper #11

Open zxyucas opened 9 months ago

zxyucas commented 9 months ago

Hello, I evaluated real_learner_bart_CGEC_exam.pt on the MuCGEC test set and got F0.5 = 40.67. Why does this not match the 45.68 reported in the paper?

HillZhang1999 commented 9 months ago

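See the original MuCGEC paper. Note that the input should first be split into sentences, each sentence corrected separately, and the results then merged. This small trick has a fairly large effect on model performance.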
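The split-correct-merge procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code from MuCGEC's predict.py: the punctuation set, the helper names `split_sentences` and `correct_document`, and the toy corrector are all assumptions made for the example, and the real splitting rule may differ.

```python
import re

# Assumed sentence terminators and closing quotes; the real rule may differ.
_TERMINATORS = "。!?!?"
_CLOSERS = "”’」』"
_SENTENCE = re.compile(
    rf"[^{_TERMINATORS}]*[{_TERMINATORS}]+[{_CLOSERS}]*|[^{_TERMINATORS}]+"
)

def split_sentences(text):
    """Split text after sentence-ending punctuation without losing characters."""
    return _SENTENCE.findall(text)

def correct_document(text, correct_fn):
    """Correct each sentence independently, then join the results in order."""
    return "".join(correct_fn(sentence) for sentence in split_sentences(text))

def toy_corrector(sentence):
    """Hypothetical stand-in for the GEC model: fixes one hard-coded typo."""
    return sentence.replace("公圆", "公园")

document = "今天天气很好。我们去公圆玩吧!你觉得呢?"
corrected = correct_document(document, toy_corrector)
```

In practice `correct_fn` would wrap the actual model call. Because the splitter keeps every character, joining the pieces of an unmodified document reproduces the input exactly.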

zxyucas commented 9 months ago

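Following predict.py in MuCGEC, I split the input into sentences first and then merged the results, which gives 41.28. That still does not match the 45.68 in the paper.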

HillZhang1999 commented 9 months ago

What are the exact precision and recall values?
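For context on why precision and recall matter here: F0.5 weights precision more heavily than recall, so the same F0.5 gap can come from quite different precision/recall trade-offs. A minimal sketch of the standard F-beta formula follows; the function name `f_beta` and the input values are illustrative only and are not results from this thread.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Made-up values: a precision-heavy system and a recall-heavy one.
precision_heavy = f_beta(0.6, 0.3)
recall_heavy = f_beta(0.3, 0.6)
```

With these made-up inputs the precision-heavy system scores higher, which is why a shift between precision and recall can move F0.5 noticeably.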

zxyucas commented 9 months ago

> What are the exact precision and recall values?

NaCGEC: see the first row.

HillZhang1999 commented 9 months ago

Which version did you run, transformers or fairseq?

HillZhang1999 commented 9 months ago

Also note that the results in the paper were obtained with beam search, not greedy decoding.
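To illustrate why the decoding strategy can change the output, here is a toy sketch in plain Python. The probability table `TOY_MODEL` is entirely made up and the `beam_search` helper is hypothetical; it is not the decoding code of either toolkit. In Hugging Face transformers the beam width is typically set through the `num_beams` argument of `generate`, and in fairseq through the `--beam` option. The beam size used for the paper is not stated in this thread.

```python
import math

# Made-up next-token probabilities, keyed by the prefix generated so far.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"x": 0.9, "y": 0.1},
}

def beam_search(model, steps, beam_size):
    """Return the best (sequence, log-probability) found with the given beam."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            for token, prob in model[prefix].items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Keep only the highest-scoring hypotheses for the next step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0]

# Greedy decoding is the special case beam_size=1.
greedy_seq, greedy_lp = beam_search(TOY_MODEL, steps=2, beam_size=1)
beam_seq, beam_lp = beam_search(TOY_MODEL, steps=2, beam_size=2)
```

In this toy table, greedy decoding commits to the locally best first token and ends on a lower-probability sequence than a beam of width 2, which keeps the second candidate alive for one more step.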