liuwei1206 / LEBERT

Code for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter"

Training on the MSRA dataset #25

Closed MichealPAPA closed 3 years ago

MichealPAPA commented 3 years ago

wcbert_token_file_2021-08-08_12:53:46.txt

MichealPAPA commented 3 years ago

I repeated the training multiple times, and the results were all similar!

liuwei1206 commented 3 years ago

Hi,

I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+.

Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check the training again.

liuwei1206 commented 3 years ago

To use multi-GPU training, you should first read up on how it works.
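For reference, a minimal sketch of single-node multi-GPU training in PyTorch via `DataParallel`; the stand-in module here is hypothetical, since the real LEBERT model and launch scripts live in the repo:

```python
import torch
import torch.nn as nn

# Stand-in module (hypothetical) in place of the actual LEBERT model.
model = nn.Linear(768, 10)

# Replicate the model across all visible GPUs when more than one is present.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
if torch.cuda.is_available():
    model = model.cuda()
```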

MichealPAPA commented 3 years ago

> Hi,
>
> I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+.
>
> Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check the training again.

I cloned the source code and trained with it directly, without any modifications. The configuration parameters were also set according to the paper. Is there any inconsistency in the released code? Or are there version requirements for CUDA or other packages? Thanks!

MichealPAPA commented 3 years ago

> Hi, I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+. Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check the training again.
>
> I cloned the source code and trained with it directly, without any modifications. The configuration parameters were also set according to the paper. Is there any inconsistency in the released code? Or are there version requirements for CUDA or other packages? Thanks!

Could the reason for the drop from 95.7 to 92.8 be that the released source code has BERT fine-tuning turned off?

liuwei1206 commented 3 years ago

> Hi, I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+. Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check the training again.
>
> I cloned the source code and trained with it directly, without any modifications. The configuration parameters were also set according to the paper. Is there any inconsistency in the released code? Or are there version requirements for CUDA or other packages? Thanks!

Please check the parameters carefully; I noticed that the training batch size and number of epochs in your script were wrong. I strongly recommend you use the script provided with the MSRA checkpoint.

This code is an exact copy of the original implementation from my paper, so I don't think it's wrong. Besides, the CUDA version can indeed cause performance differences, but not of such a large magnitude.
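As an aside, here is a minimal sketch (not from this repo) of the standard PyTorch seeding and cuDNN-determinism settings, which help separate genuine configuration problems from run-to-run variance:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Pin the common RNG sources so repeated runs are comparable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels trade some speed for reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```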

liuwei1206 commented 3 years ago

> Hi, I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+. Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check the training again.
>
> I cloned the source code and trained with it directly, without any modifications. The configuration parameters were also set according to the paper. Is there any inconsistency in the released code? Or are there version requirements for CUDA or other packages? Thanks!
>
> Could the reason for the drop from 95.7 to 92.8 be that the released source code has BERT fine-tuning turned off?

Please check it yourself. I believe you will learn a lot from it if you read the code in detail.
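One quick way to check this yourself: a minimal sketch, assuming a standard `transformers` BERT backbone (illustrative only, not the repo's code), that reports how many BERT parameters are actually trainable:

```python
from transformers import BertModel

# Load a Chinese BERT backbone comparable to the one used by LEBERT;
# run the same check on the model object constructed by the repo's trainer.
model = BertModel.from_pretrained("bert-base-chinese")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
print(f"trainable: {trainable:,}  frozen: {frozen:,}")

# If fine-tuning were disabled, the encoder weights would show
# requires_grad=False here, or be missing from the optimizer's param groups.
```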

MichealPAPA commented 3 years ago

> > Hi, I think there is something wrong with your training. As we know, even a BERT-based model can achieve an F1 score of 94.0+. Did you use the shell file in the MSRA checkpoint? And did you test the performance using the checkpoint I provided? I recommend you read the paper in detail and check the training again.
> >
> > I cloned the source code and trained with it directly, without any modifications. The configuration parameters were also set according to the paper. Is there any inconsistency in the released code? Or are there version requirements for CUDA or other packages? Thanks!
>
> Please check the parameters carefully; I noticed that the training batch size and number of epochs in your script were wrong. I strongly recommend you use the script provided with the MSRA checkpoint.
>
> This code is an exact copy of the original implementation from my paper, so I don't think it's wrong. Besides, the CUDA version can indeed cause performance differences, but not of such a large magnitude.

Many thanks!