bojone / bert4keras

keras implement of transformers for humans
https://kexue.fm/archives/6915
Apache License 2.0

NER training: abnormal on tf2.2, normal on tf1.14 #316

Open hongyinjie opened 3 years ago

hongyinjie commented 3 years ago

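When asking a question, please provide as much of the following information as possible:

Basic information

Core code

# Paste your core code here.
# Keep only the key parts; do not paste all of your code indiscriminately.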

Code being run:

task_sequence_labeling_ner_crf and task_sequence_labeling_cws_crf

task_sequence_labeling_cws_crf: produces normal output
task_sequence_labeling_ner_crf: the loss is very large

However, after I switched tf to 1.14, everything works normally.

tf2.2:

83%|████████▎ | 3839/4636 [00:28<00:05, 135.11it/s]
88%|████████▊ | 4068/4636 [00:30<00:04, 124.61it/s]
93%|█████████▎| 4304/4636 [00:31<00:02, 147.30it/s]
98%|█████████▊| 4541/4636 [00:33<00:00, 128.40it/s]
100%|██████████| 4636/4636 [00:33<00:00, 136.38it/s]
[[ 394.95874 -176.36295 -7.8401017 548.1664 -215.90518 -320.57907 173.12894 ]
 [ 50.637447 57.154224 -421.39597 -57.696938 159.27237 -302.50723 546.85077 ]
 [-131.41158 485.80887 -155.1044 -336.75372 -342.46317 -105.33603 256.02954 ]
 [-246.23201 -1.0806776 -299.04547 -383.3489 -103.078354 -501.77563 -524.0191 ]
 [ 153.22589 443.8938 372.6826 -357.67694 409.4358 66.46063 -449.94702 ]
 [ 152.81165 256.4578 446.38495 -156.91025 523.1699 28.664707 -191.63524 ]
 [ 555.72314 -103.25582 143.87459 513.51495 -371.6807 32.97588 -184.39626 ]]
valid: f1: 0.04041, precision: 0.03082, recall: 0.05864, best f1: 0.04463

test: f1: 0.04030, precision: 0.03051, recall: 0.05932

Epoch 10/20

1/652 [..............................] - ETA: 4:10 - loss: 3169.1255 - sparse_accuracy: 0.9567
2/652 [..............................] - ETA: 5:15 - loss: 3372.9714 - sparse_accuracy: 0.9579
3/652 [..............................] - ETA: 5:23 - loss: 3111.1815 - sparse_accuracy: 0.9601
4/652 [..............................] - ETA: 4:57 - loss: 3079.9089 - sparse_accuracy: 0.9564
5/652 [..............................] - ETA: 4:27 - loss: 3079.6596 - sparse_accuracy: 0.9586
6/652 [..............................] - ETA: 4:12 - loss: 3119.9171 - sparse_accuracy: 0.9561

tf1.14:

100%|██████████| 4636/4636 [01:16<00:00, 60.44it/s]
[[ 7.16037631e-01 5.14450431e-01 -1.78328049e+00 1.73756212e-01 -2.75914931e+00 6.65736020e-01 -2.37245560e+00]
 [-9.04985428e-01 -6.50179565e-01 1.11518121e+00 -3.27171385e-01 -7.40627885e-01 -4.65121299e-01 -6.89429283e-01]
 [-4.93433028e-01 1.09565500e-02 7.67034054e-01 -1.39231846e-01 -7.43359983e-01 -6.92604780e-01 -1.65410519e+00]
 [-2.09137306e-01 -7.32884526e-01 -4.98064905e-01 -8.51817727e-02 2.19344211e+00 -5.92046082e-01 -2.18841100e+00]
 [-5.77763796e-01 -1.20774755e-04 -1.60643029e+00 -1.29808679e-01 1.52377093e+00 -7.41425693e-01 -2.82630181e+00]
 [-2.29889274e+00 -3.26770931e-01 -1.05772102e+00 -6.81385100e-01 -2.83614874e+00 -1.47397673e+00 1.81358981e+00]
 [-1.79576516e+00 -9.83837903e-01 -1.08787990e+00 -1.34428167e+00 -2.87789464e+00 -2.97322130e+00 1.64370608e+00]]
valid: f1: 0.94411, precision: 0.94317, recall: 0.94504, best f1: 0.94411

test: f1: 0.93164, precision: 0.92686, recall: 0.93647

Epoch 2/10

1/1304 [..............................] - ETA: 5:45 - loss: 0.4944 - sparse_accuracy: 0.9839
2/1304 [..............................] - ETA: 5:49 - loss: 0.5252 - sparse_accuracy: 0.9852
3/1304 [..............................] - ETA: 6:29 - loss: 0.7188 - sparse_accuracy: 0.9841
4/1304 [..............................] - ETA: 7:08 - loss: 0.6739 - sparse_accuracy: 0.9841
5/1304 [..............................] - ETA: 12:43 - loss: 0.6096 - sparse_accuracy: 0.9835
6/1304 [..............................] - ETA: 13:02 - loss: 0.5395 - sparse_accuracy: 0.9849
7/1304 [..............................] - ETA: 12:29 - loss: 0.5447 - sparse_accuracy: 0.9829

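Own attempts

Whatever the problem, please try to solve it yourself first, and ask only if you still cannot solve it after making every effort. Please paste your troubleshooting process here.

Has anyone run into this before? Thanks. I will keep investigating; my guess is that some op behaves inconsistently between the two versions.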

hongyinjie commented 3 years ago

You need to switch to tf.keras by setting os.environ['TF_KERAS'] = '1'. This is now resolved.
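For anyone landing here later, a minimal sketch of where that setting goes. It assumes bert4keras reads `TF_KERAS` when it is first imported, so the variable has to be set before any bert4keras import. The helper name `load_keras` is made up for this example and is not part of the library.

```python
import os

# Set this before the first bert4keras import (assumption: the backend
# choice is read at import time, so setting it afterwards has no effect).
os.environ["TF_KERAS"] = "1"


def load_keras():
    """Hypothetical helper: return the keras module bert4keras ends up using.

    Requires bert4keras and TensorFlow to be installed.
    """
    from bert4keras.backend import keras  # expected to be tf.keras when TF_KERAS=1
    return keras
```

In a training script the two setup lines would simply sit at the very top of the file, above every other import that touches bert4keras or keras.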

zouweidong91 commented 1 year ago

Switch tf to 2.3.0 and set os.environ['TF_KERAS'] = '1'. In your own code, change every `from keras` to `from tensorflow.keras`.
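If there are many files to update, the import change can be scripted. The following is only a rough sketch: `to_tf_keras` is a hypothetical helper written for this comment, it only rewrites lines that start with `from keras`, and it leaves `import keras` statements and similarly named packages untouched, so review the result by hand.

```python
import re

# Matches "from keras" at the start of a line (optionally indented),
# but not packages that merely start with "keras", e.g. keras_contrib.
_FROM_KERAS = re.compile(r"^([ \t]*)from\s+keras\b", re.MULTILINE)


def to_tf_keras(source: str) -> str:
    """Rewrite `from keras...` imports to `from tensorflow.keras...`."""
    return _FROM_KERAS.sub(r"\1from tensorflow.keras", source)


example = (
    "from keras.layers import Dense\n"
    "from keras import backend as K\n"
    "from keras_contrib import x\n"
)
converted = to_tf_keras(example)
print(converted)
```

Running it on the example string rewrites the first two imports and leaves the `keras_contrib` line alone.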