bojone / bert4keras

keras implement of transformers for humans

https://kexue.fm/archives/6915

Apache License 2.0

5.37k stars 929 forks

Open zhengwsh opened 4 years ago

zhengwsh commented 4 years ago

Can bert4keras support fp16 mixed-precision training the way PyTorch does? Is that feasible?

bojone commented 4 years ago

I explored this before but have not succeeded yet. I will try again later.

zhys513 commented 3 years ago

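    import tensorflow as tf

    # LR and model are assumed to be defined elsewhere
    opt = tf.keras.optimizers.Adam(LR)

    # Added line: enable mixed-precision training
    opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(
        opt, loss_scale='dynamic')

    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=opt,  # use a sufficiently small learning rate
        metrics=['accuracy'],
    )

The code above is offered for reference.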
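As background on the `loss_scale='dynamic'` argument: fp16 rounds very small gradients to zero and overflows above 65504, so the loss is multiplied by a scale factor before backpropagation and the gradients are divided by it afterwards. The sketch below illustrates that general idea in pure Python. It is not TensorFlow's implementation; `to_fp16`, `DynamicLossScale`, `train_step`, and the default values are hypothetical names and numbers chosen for illustration.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value.

    struct's 'e' format raises OverflowError for values too large for fp16,
    so overflow is mapped to +/-inf here, as fp16 hardware would do.
    """
    try:
        return struct.unpack('e', struct.pack('e', x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


class DynamicLossScale:
    """Hypothetical dynamic loss scale: halve on overflow, double after a
    run of finite steps. Defaults are illustrative assumptions."""

    def __init__(self, initial_scale=2.0 ** 15, growth_interval=2000):
        self.scale = initial_scale
        self.growth_interval = growth_interval
        self.good_steps = 0

    def update(self, grads_are_finite):
        if grads_are_finite:
            self.good_steps += 1
            if self.good_steps >= self.growth_interval:
                self.scale *= 2.0
                self.good_steps = 0
        else:
            self.scale /= 2.0
            self.good_steps = 0


def train_step(true_grad, scaler):
    """Simulate one step: the scaled gradient is stored in fp16, then
    unscaled in full precision. Returns None when the step is skipped."""
    scaled = to_fp16(true_grad * scaler.scale)
    finite = math.isfinite(scaled)
    unscaled = scaled / scaler.scale if finite else None
    scaler.update(finite)
    return unscaled


# A gradient of 1e-8 vanishes in fp16 but survives when scaled by 1024.
lost = to_fp16(1e-8)
recovered = to_fp16(1e-8 * 1024) / 1024

# A large gradient overflows until the scale has been halved far enough.
scaler = DynamicLossScale()
results = [train_step(4.0, scaler) for _ in range(3)]
```

Skipping the update on overflow, instead of applying an inf or NaN gradient, is what keeps the weights finite while the scale settles.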

Atakey commented 3 years ago

Quoting bojone: "I explored this before but have not succeeded yet. I will try again later."

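In the layers file, for every class that inherits from Layer, add the argument dtype=self.dtype to each call to self.add_weight.

I tested this on TF 2.1 through 2.3 and mixed-precision training can be enabled that way, although some models seem to end up with a NaN loss.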
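A minimal sketch of the `dtype=self.dtype` change described above, applied to a made-up layer. `ScaleOffset` is hypothetical and is not taken from bert4keras. The sketch assumes TensorFlow 2.4 or later, where a policy name such as `'mixed_float16'` can be passed as a layer's `dtype`; on the 2.1 to 2.3 releases mentioned in the comment, the policy classes lived under `tf.keras.mixed_precision.experimental` instead.

```python
import tensorflow as tf


class ScaleOffset(tf.keras.layers.Layer):
    """Hypothetical layer computing inputs * gamma + beta."""

    def build(self, input_shape):
        dim = input_shape[-1]
        # Under a mixed policy, self.dtype is the variable dtype (float32),
        # so the weights stay in full precision.
        self.gamma = self.add_weight(
            name='gamma', shape=(dim,), initializer='ones', dtype=self.dtype)
        self.beta = self.add_weight(
            name='beta', shape=(dim,), initializer='zeros', dtype=self.dtype)
        super().build(input_shape)

    def call(self, inputs):
        # Inputs arrive in the compute dtype (float16 under mixed_float16).
        # The explicit cast keeps both operands in that dtype and is a no-op
        # if Keras has already autocast the variables.
        gamma = tf.cast(self.gamma, inputs.dtype)
        beta = tf.cast(self.beta, inputs.dtype)
        return inputs * gamma + beta


# Assumption: TF >= 2.4, where the policy name is accepted as a layer dtype.
layer = ScaleOffset(dtype='mixed_float16')
outputs = layer(tf.ones((2, 4)))
```

With this setup the weights are stored in float32 while the layer output is float16, which is the split that mixed-precision training relies on.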