bojone / bert4keras

keras implement of transformers for humans
https://kexue.fm/archives/6915
Apache License 2.0
5.36k stars 927 forks

Reusing a model trained with keras_bert #514

Closed mianzhiwj closed 1 year ago

mianzhiwj commented 1 year ago

After loading pretrained weights with keras_bert and training or fine-tuning the model, I found that reusing the keras_bert weights under bert4keras produced inconsistent results. After repeated testing, the cause turned out to be that the two libraries order the LayerNorm gamma and beta parameters differently. You can load the weights by swapping the positions of gamma and beta, for example:

```python
def weight_mapping(weights):
    """Swap the positions of the gamma and beta parameters."""
    return_weights = []
    gamma = None
    for w in weights:
        if 'gamma' in w.name:
            gamma = w
        elif 'beta' in w.name:
            return_weights.append(w)
            assert gamma is not None
            return_weights.append(gamma)
            gamma = None
        else:
            return_weights.append(w)
    return return_weights

bert4keras_model.set_weights(K.batch_get_value(weight_mapping(keras_bert_model.weights)))
```
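To see what the reordering does without loading any actual model, here is a minimal, self-contained sketch. It uses a `namedtuple` stand-in for Keras variables (only the `.name` attribute matters to `weight_mapping`); the weight names are hypothetical and just illustrate a gamma/beta pair following a kernel.

```python
from collections import namedtuple

# Stand-in for a Keras variable: weight_mapping only reads .name.
FakeWeight = namedtuple('FakeWeight', ['name'])

def weight_mapping(weights):
    """Swap the positions of the gamma and beta parameters."""
    return_weights = []
    gamma = None
    for w in weights:
        if 'gamma' in w.name:
            gamma = w  # hold gamma until its beta partner appears
        elif 'beta' in w.name:
            return_weights.append(w)      # emit beta first
            assert gamma is not None
            return_weights.append(gamma)  # then the held gamma
            gamma = None
        else:
            return_weights.append(w)
    return return_weights

# keras_bert lists each LayerNorm's weights as [gamma, beta];
# after mapping, each pair comes out as [beta, gamma].
weights = [
    FakeWeight('encoder/kernel'),
    FakeWeight('encoder/LayerNorm/gamma'),
    FakeWeight('encoder/LayerNorm/beta'),
]
mapped = weight_mapping(weights)
print([w.name for w in mapped])
# → ['encoder/kernel', 'encoder/LayerNorm/beta', 'encoder/LayerNorm/gamma']
```

Note that non-LayerNorm weights pass through unchanged, so the mapped list can be fed straight into `set_weights` as shown above.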