Closed MichealPAPA closed 3 years ago
I repeated training multiple times, and the results were similar!
Hi,
I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+.
Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check your training again.
To use multi-GPU training, you should first read the materials about it.
I cloned the source code and trained it directly, without any changes. The configuration parameters were also set according to the paper. Is there an inconsistency in the released code? Or are there requirements on the CUDA version or other package versions? Thanks!
Could the drop from 95.7 to 92.8 be because the released source code disabled BERT fine-tuning?
Please check the parameters carefully; I saw that the training batch size and number of epochs in your script were wrong. I strongly recommend you use the script provided with the MSRA checkpoint.
This code is an exact copy of the original implementation from my paper, so I don't think it is wrong. Besides, the CUDA version can make a performance difference, but not one of this magnitude.
Please check it yourself. I believe you will learn a lot from it if you read the code in detail.
Thanks a lot!
wcbert_token_file_2021-08-08_12:53:46.txt