orangetwo / ernie-csc

Correcting Chinese Spelling Errors with Phonetic Pre-training (unofficial implementation)

Poor experimental results #2

Open weitajinjucha opened 2 years ago

weitajinjucha commented 2 years ago

Roughly what results did you get with this code? I ran it with --batch_size 32 --logging_steps 500 --save_steps 2000 --epochs 10 --learning_rate 5e-5 --max_seq_length 128, and the best result was:

Eval: Sentence-Level Performance:
Detection metric: F1=0.7307, Recall=0.7008, Precision=0.7633
Correction metric: F1=0.7058, Recall=0.6639, Precision=0.7534
Save best model at 66000 step.

The paper reports a Detection F1 in the eighties, so a gap of seven or eight points coming just from the difference in the ERNIE model seems a bit odd.

orangetwo commented 2 years ago

I haven't run it, since the ERNIE pre-trained model was never released. Did you make any changes to the code when you ran it?

weitajinjucha commented 2 years ago

I haven't run it, since the ERNIE pre-trained model was never released. Did you make any changes to the code when you ran it?

I changed some of the config settings in transformers; nothing else.

macanv commented 2 years ago

@Zhouyuhao97, which dataset were these results obtained on?

anshen666 commented 1 year ago

Roughly what results did you get with this code? I ran it with --batch_size 32 --logging_steps 500 --save_steps 2000 --epochs 10 --learning_rate 5e-5 --max_seq_length 128, and the best result was:

Eval: Sentence-Level Performance:
Detection metric: F1=0.7307, Recall=0.7008, Precision=0.7633
Correction metric: F1=0.7058, Recall=0.6639, Precision=0.7534
Save best model at 66000 step.

The paper reports a Detection F1 in the eighties, so a gap of seven or eight points coming just from the difference in the ERNIE model seems a bit odd.

Hi, when I run the ernie-csc code I get this error: OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index', 'flax_model.msgpack'] found in directory ernie or from_tf and from_flax set to False. Do you know how to resolve it? Or could you leave some contact information?
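
For context, that OSError is what Hugging Face transformers raises when from_pretrained points at a local directory (here ernie) that contains no weight file in any supported format. A minimal sketch of one workaround, assuming you are willing to use a community PyTorch conversion of ERNIE from the Hub (the model id nghuyong/ernie-1.0-base-zh and the local ernie path are assumptions, not something this repo prescribes):

```python
# Hedged sketch: populate a local `ernie` directory with a PyTorch-format ERNIE
# checkpoint so that from_pretrained("ernie") finds a usable weight file.
# The Hub id below is a community conversion, not the paper's own checkpoint.
from transformers import AutoModel, AutoTokenizer

hub_id = "nghuyong/ernie-1.0-base-zh"  # assumed community PyTorch port of ERNIE 1.0

tokenizer = AutoTokenizer.from_pretrained(hub_id)
model = AutoModel.from_pretrained(hub_id)

tokenizer.save_pretrained("ernie")  # writes the vocab and tokenizer config into ernie/
model.save_pretrained("ernie")      # writes config.json plus the weights
                                    # (pytorch_model.bin or model.safetensors,
                                    # depending on the transformers version)
```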

shirain-he commented 9 months ago

I ran ernie1.0, ernie3.0, and ernie-gram separately; the results are as follows:

ernie1.0: Sentence-Level Performance:
Detection metric: F1=0.7262449528936744, Recall=0.6642048252092565, Precision=0.8010688836104513
Correction metric: F1=0.687742651136994, Recall=0.6105366814377154, Precision=0.7873015873015873

ernie3.0: Sentence-Level Performance:
Detection metric: F1=0.6717277486910995, Recall=0.6317085179714427, Precision=0.7171604248183343
Correction metric: F1=0.6490947816826411, Recall=0.6001969473165928, Precision=0.7066666666666667

ernie-gram hasn't finished yet. But the original paper seems to evaluate SIGHAN13, 14, and 15 separately, while this code appears to merge the three into a single test set, so a direct comparison isn't quite fair (though there is clearly still a gap to the paper's MLM-phonetic results), especially since we aren't using their pre-trained model. (image attachment)
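
For reference, the sentence-level numbers quoted in this thread follow the usual CSC convention, where a sentence only counts as correct if every error position (detection) or every character (correction) is right. A minimal sketch of that metric as I read it, not necessarily the exact logic of this repo's evaluation script:

```python
# Hedged sketch of sentence-level detection/correction P/R/F1 for CSC.
# srcs, golds, preds are equal-length lists of equal-length strings.
def sentence_level_metrics(srcs, golds, preds):
    det_tp = cor_tp = pred_pos = gold_pos = 0
    for src, gold, pred in zip(srcs, golds, preds):
        gold_err = [i for i, (s, g) in enumerate(zip(src, gold)) if s != g]
        pred_err = [i for i, (s, p) in enumerate(zip(src, pred)) if s != p]
        if pred_err:
            pred_pos += 1                      # model claims this sentence has errors
        if gold_err:
            gold_pos += 1                      # sentence actually has errors
        if pred_err and pred_err == gold_err:  # detection: exact set of error positions
            det_tp += 1
            if pred == gold:                   # correction: every error also fixed
                cor_tp += 1
    f1 = lambda p, r: 2 * p * r / (p + r) if p + r else 0.0
    det_p = det_tp / pred_pos if pred_pos else 0.0
    det_r = det_tp / gold_pos if gold_pos else 0.0
    cor_p = cor_tp / pred_pos if pred_pos else 0.0
    cor_r = cor_tp / gold_pos if gold_pos else 0.0
    return {"detection": (det_p, det_r, f1(det_p, det_r)),
            "correction": (cor_p, cor_r, f1(cor_p, cor_r))}
```

Under this convention, merging SIGHAN13/14/15 into one test set changes the denominators, which is why the merged numbers can't be compared line by line against the per-year tables in the paper.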

shirain-he commented 9 months ago

I ran ernie1.0, ernie3.0, and ernie-gram separately; the results are as follows:

ernie1.0: Sentence-Level Performance:
Detection metric: F1=0.7262449528936744, Recall=0.6642048252092565, Precision=0.8010688836104513
Correction metric: F1=0.687742651136994, Recall=0.6105366814377154, Precision=0.7873015873015873

ernie3.0: Sentence-Level Performance:
Detection metric: F1=0.6717277486910995, Recall=0.6317085179714427, Precision=0.7171604248183343
Correction metric: F1=0.6490947816826411, Recall=0.6001969473165928, Precision=0.7066666666666667

ernie-gram hasn't finished yet. But the original paper seems to evaluate SIGHAN13, 14, and 15 separately, while this code appears to merge the three into a single test set, so a direct comparison isn't quite fair (though there is clearly still a gap to the paper's MLM-phonetic results), especially since we aren't using their pre-trained model. (image attachment)

Here is the ernie2.0 result. 10 epochs seems a bit too few; it hasn't converged yet.

global step 86000, epoch: 9, batch: 6854, loss: 0.3496193587779999, speed: 5.263427433808946 step/s
Sentence-Level Performance:
Detection metric: F1=0.7129084092126406, Recall=0.65534219596258, Precision=0.7815619495008808
Correction metric: F1=0.6840087623220154, Recall=0.6149679960610537, Precision=0.770512029611351
Save best model at 86000 step.