bojone / bert4keras

Keras implementation of transformers for humans
https://kexue.fm/archives/6915
Apache License 2.0

RealFormer build error; please provide a test example of RealFormer usage #347

Closed xsthunder closed 3 years ago

xsthunder commented 3 years ago

This is probably a problem with the build_transformer_model arguments when constructing the model. I couldn't find any usage example for the residual_attention_scores parameter in the examples; please provide one.

Basic information

Core code

# Paste your core code here.
# Keep only the key parts; don't blindly paste all of it.
from pathlib import Path
bert_weight_path = Path("~/bert/chinese_roberta_wwm_ext_L-12_H-768_A-12").expanduser()
config_path = str(bert_weight_path/'bert_config.json')
checkpoint_path = str(bert_weight_path/'bert_model.ckpt')
dict_path = str(bert_weight_path/'vocab.txt')
import os
os.environ["CUDA_VISIBLE_DEVICES"] = '2'
os.environ["RECOMPUTE"] = '1'
os.environ["KERAS_BACKEND"] = 'tensorflow'
os.environ['TF_KERAS'] = '1'  # must use tf.keras
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
from bert4keras.models import build_transformer_model
model = build_transformer_model(
    config_path,
    checkpoint_path,
    application='encoder',
    return_keras_model=False, 
    residual_attention_scores=True
)

Output

/usr/local/lib/python3.8/dist-packages/bert4keras/models.py in apply(self, inputs, layer, arguments, **kwargs)
    157                         inputs = inputs[:3] + [a_bias] + inputs[4:]
    158                         arguments['a_bias'] = True
--> 159                     o, a = self.layers[name](inputs, **arguments)
    160                     self.attention_scores = a
    161                     return o

ValueError: Tried to convert 'input' to a tensor and failed. Error: Shapes must be equal rank, but are 3 and 4
    From merging shape 0 with other shapes. for '{{node Transformer-0-MultiHeadSelfAttention/Identity/packed}} = Pack[N=2, T=DT_FLOAT, axis=0](Transformer-0-MultiHeadSelfAttention/mul_1, Transformer-0-MultiHeadSelfAttention/sub_1)' with input shapes: [?,?,768], [?,12,?,?].

My attempts

Whatever the problem is, please try to solve it yourself first, and only ask after you still can't solve it despite your best efforts. Paste your attempts here.

Attempt 1: verify the environment

Removing residual_attention_scores=True, the model builds successfully.
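
For reference, the same call with that one flag removed builds cleanly:

# Identical call, minus residual_attention_scores; builds without error.
model = build_transformer_model(
    config_path,
    checkpoint_path,
    application='encoder',
    return_keras_model=False,
)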

Attempt 2: find the op in the graph and dynamically build a bypass for the attention-weight matrix

Failed. In TF1, a tensor produced by a graph op can't be used to build a keras.Model; its value can only be obtained by running the op in a session.

# Grab the attention layer, then fish the softmax op out of the graph.
att_layer = model.apply(name="Transformer-1-MultiHeadSelfAttention")
softmax_op = att_layer.output.graph.get_operation_by_name("Transformer-0-MultiHeadSelfAttention/Softmax")
tf.keras.Model(model.inputs, softmax_op.outputs)  # fails: raw TF1 op outputs are not Keras tensors

InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: You must feed a value for placeholder tensor 'Input-Token' with dtype float and shape [?,?]
     [[{{node Input-Token}}]]
     [[Transformer-0-MultiHeadSelfAttention/mul/_1183]]
  (1) Invalid argument: You must feed a value for placeholder tensor 'Input-Token' with dtype float and shape [?,?]
     [[{{node Input-Token}}]]
0 successful operations.
0 derived errors ignored.
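
Concretely, the op's value can only be fetched after the fact, by running it in a session and feeding the model's input placeholders directly, along these lines (a sketch; the Input-Token / Input-Segment placeholder names are the ones from the error log above):

import numpy as np
import tensorflow as tf
from bert4keras.tokenizers import Tokenizer

tokenizer = Tokenizer(dict_path, do_lower_case=True)
token_ids, segment_ids = tokenizer.encode(u'测试文本')

# Fetch the raw softmax tensor by feeding the graph's input placeholders.
attn_scores = tf.compat.v1.keras.backend.get_session().run(
    softmax_op.outputs[0],
    feed_dict={
        'Input-Token:0': np.array([token_ids]),
        'Input-Segment:0': np.array([segment_ids]),
    },
)
print(attn_scores.shape)  # expected (1, num_heads, seq_len, seq_len)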

Switching to TF 1.15

Python 3.6.9 (Ubuntu 18.04), tf.keras: '2.2.4-tf', tf: '1.15.4', bert4keras: 0.9.9, weights: chinese_roberta_wwm_ext_L-12_H-768_A-12

/usr/local/lib/python3.6/dist-packages/bert4keras/models.py in apply(self, inputs, layer, arguments, **kwargs)
    157                         inputs = inputs[:3] + [a_bias] + inputs[4:]
    158                         arguments['a_bias'] = True
--> 159                     o, a = self.layers[name](inputs, **arguments)
    160                     self.attention_scores = a
    161                     return o

The output message changes:

ValueError: Tried to convert 'input' to a tensor and failed. Error: Shapes must be equal rank, but are 3 and 4
    From merging shape 0 with other shapes. for 'Transformer-0-MultiHeadSelfAttention_1/Identity/packed' (op: 'Pack') with input shapes: [?,?,768], [?,12,?,?].

As before, removing residual_attention_scores=True makes the build succeed.

bojone commented 3 years ago

You mean RealFormer, right?

Under tf 1.15, I tested both keras and tf.keras, and the above model runs successfully in both. So the model's implementation itself is fine.
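
For reference, a minimal test along these lines reproduces this (a sketch; the paths are placeholders):

import os
os.environ['TF_KERAS'] = '1'  # or leave unset to test plain keras; both worked
import numpy as np
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer

config_path = '/path/to/bert_config.json'    # placeholder paths
checkpoint_path = '/path/to/bert_model.ckpt'
dict_path = '/path/to/vocab.txt'

model = build_transformer_model(
    config_path,
    checkpoint_path,
    residual_attention_scores=True,  # RealFormer: residual attention scores across layers
)

tokenizer = Tokenizer(dict_path, do_lower_case=True)
token_ids, segment_ids = tokenizer.encode(u'语言模型')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))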

Have you tried removing tf.compat.v1.disable_eager_execution() to see whether it works then?

I'm rather resistant to tf 2.x and don't plan to develop specifically for it for now; the best I can do is try to keep supporting both~

xsthunder commented 3 years ago

Removing tf.compat.v1.disable_eager_execution() didn't help.

I'll set this aside for now. Thanks, 苏神, for the quick reply.