gitabtion / BertBasedCorrectionModels

PyTorch implementations of BERT-based Spelling Error Correction Models. (BERT-based text correction models, implemented in PyTorch.)
Apache License 2.0
265 stars 43 forks

cannot Reproduce the result #17

Closed leon2milan closed 3 years ago

leon2milan commented 3 years ago

I followed the steps and got a different result.

```
Epoch 9: 100%|████████| 199/199 [00:55<00:00, 3.56it/s, loss=0.103, v_num=1]
/home/dell/workspace/jiangbingyu/correction/checkpoints/SoftMaskedBert/epoch=09-val_loss=0.13123.ckpt
Testing: 0it [00:00, ?it/s]
2021-09-08 23:47:58,342 SoftMaskedBertModel INFO: Testing...
Testing: 97%|████████ | 67/69 [00:03<00:00, 18.43it/s]
2021-09-08 23:48:02,103 SoftMaskedBertModel INFO: Test.
2021-09-08 23:48:02,105 SoftMaskedBertModel INFO: loss: 0.08779423662285873
2021-09-08 23:48:02,105 SoftMaskedBertModel INFO: Detection: acc: 0.5000
2021-09-08 23:48:02,106 SoftMaskedBertModel INFO: Correction: acc: 0.6900
2021-09-08 23:48:02,114 SoftMaskedBertModel INFO: The detection result is precision=0.8228782287822878, recall=0.6308345120226309 and F1=0.7141713370696557
2021-09-08 23:48:02,115 SoftMaskedBertModel INFO: The correction result is precision=0.7399103139013453, recall=0.6534653465346535 and F1=0.694006309148265
2021-09-08 23:48:02,116 SoftMaskedBertModel INFO: Sentence Level: acc:0.690000, precision:0.829508, recall:0.466790, f1:0.597403
Testing: 100%|████████| 69/69 [00:03<00:00, 18.27it/s]

DATALOADER:0 TEST RESULTS {'val_loss': 0.08779423662285873}
```

gitabtion commented 3 years ago

Your training data looks wrong. How can the training set have only 199 batches?

guo453585719 commented 3 years ago

Hi, I tried macbert, trained on the SIGHAN dataset without changing any parameters, and I also can't reproduce the reported results. Did I do something wrong? Training on SIGHAN, my training set has only 235 batches.

guo453585719 commented 3 years ago

```
Epoch 9: 100%|████████| 235/235 [02:02<00:00, 1.91it/s, loss=0.0409, v_num=3]
/data/juicefs_translation/11130680/asr_corrector/BertBasedCorrectionModels-master/data/checkpoints/macbert4csc/epoch=09-val_loss=0.05311.ckpt
Testing: 0it [00:00, ?it/s]
2021-11-04 19:37:11,769 macbert4csc INFO: Testing...
Testing: 100%|████████| 138/138 [00:06<00:00, 22.41it/s]
2021-11-04 19:37:18,080 macbert4csc INFO: Test.
2021-11-04 19:37:18,081 macbert4csc INFO: loss: 0.0411618158855624
2021-11-04 19:37:18,081 macbert4csc INFO: Detection: acc: 0.5600
2021-11-04 19:37:18,082 macbert4csc INFO: Correction: acc: 0.6627
2021-11-04 19:37:18,091 macbert4csc INFO: The detection result is precision=0.7776, recall=0.6973 and F1=0.7353
2021-11-04 19:37:18,091 macbert4csc INFO: The correction result is precision=0.6795, recall=0.5971 and F1=0.6357
2021-11-04 19:37:18,093 macbert4csc INFO: Sentence Level: acc:0.6627, precision:0.7785, recall:0.4410, f1:0.5630
Testing: 100%|████████| 138/138 [00:06<00:00, 21.81it/s]

DATALOADER:0 TEST RESULTS {'val_loss': 0.0411618158855624}
```

gitabtion commented 3 years ago

I added wang271k to the training data; SIGHAN alone is too small.
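The batch counts in the logs above already hint at this. A minimal sanity-check sketch (the batch size of 32 and the dataset sizes are assumptions for illustration, not values from this repo's config; check the actual training config):

```python
import math

def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Batches per epoch, counting a final partial batch."""
    return math.ceil(num_samples / batch_size)

# Rough illustrative sizes: a SIGHAN-only training set of ~6.4k sentences,
# wang271k adding ~271k sentences. Batch size 32 is assumed.
sighan_only = batches_per_epoch(6_400, 32)
combined = batches_per_epoch(6_400 + 271_000, 32)

print(sighan_only)  # ~200 batches/epoch, the same order as the logs above
print(combined)     # thousands of batches/epoch once wang271k is included
```

So seeing only ~199 or ~235 batches per epoch is consistent with training on SIGHAN alone, which is why the reported numbers (trained with wang271k added) are not reproduced.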

guo453585719 commented 3 years ago

Thanks for the reply! I'll give that a try too.