bojone / bert4keras

keras implement of transformers for humans
https://kexue.fm/archives/6915
Apache License 2.0

How to load multiple decoders for a T5 model #483

Open KyrieIrving24 opened 2 years ago

KyrieIrving24 commented 2 years ago

When asking a question, please provide the following information where possible:

Basic information

Core code

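import tensorflow as tf


class Multi_decoder(tf.keras.Model):
    """Wraps a separately built encoder and decoder into a single model."""

    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def call(self, inputs):
        # inputs is a pair: encoder token ids and decoder token ids
        encoder_input, decoder_input = inputs
        # the encoder is expected to return two outputs here: encodings and mask
        encoder_encodings, encoder_masks = self.encoder(encoder_input)
        decoder_outputs = self.decoder([decoder_input, encoder_encodings, encoder_masks])
        return decoder_outputs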

Output

 Traceback (most recent call last):
  File "call.py", line 175, in <module>
    model.fit(x=[batch_t_token_ids, batch_p_token_ids], y=batch_p_token_ids, batch_size=batch_size, epochs=epochs, callbacks=[evaluator])
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit
    use_multiprocessing=use_multiprocessing)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 643, in fit
    shuffle=shuffle)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2418, in _standardize_user_data
    all_inputs, y_input, dict_inputs = self._build_model_with_inputs(x, y)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2621, in _build_model_with_inputs
    self._set_inputs(cast_inputs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2708, in _set_inputs
    outputs = self(inputs, **kwargs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 854, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
NameError: in converted code:

    call.py:23 call  *
        encoder_encodings, encoder_masks = self.encoder(encoder_input)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/base_layer.py:506 __call__  *
        output_shape = self.compute_output_shape(input_shape)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/network.py:656 compute_output_shape  *
        output_shape = layer.compute_output_shape(
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/layers/merge.py:173 compute_output_shape  *
        output_shape = self._compute_elemwise_op_output_shape(output_shape,
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/layers/merge.py:50 _compute_elemwise_op_output_shape  *
        for i, j in zip(shape1[-len(shape2):], shape2):
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:339 for_stmt
        return _py_for_stmt(iter_, extra_test, body, get_state, set_state, init_vars)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:348 _py_for_stmt
        if extra_test is not None and not extra_test(*state):
    /var/folders/v_/m84qz0751dv95zzxzwll27840000gp/T/tmpnbr9tvxs.py:158 extra_test
        return ag__.not_(do_return_2)

    NameError: free variable 'do_return_2' referenced before assignment in enclosing scope

What I have tried

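Hello, I would like to try using multiple decoders based on T5, that is, to split T5 apart and duplicate the decoder several times. The idea is to load multiple decoders through build_transformer_model. So far I only have a single decoder, and even that fails to run. Can this be achieved with code structured like the above?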

bojone commented 2 years ago

Judging from the error message, this does not seem to be related to the model implementation?
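
One way to read the traceback along these lines: its frames come from two different Keras packages, `tensorflow_core/python/keras/...` (tf.keras, which `Multi_decoder` subclasses) and `keras/...` (standalone Keras, which the encoder's layers appear to use). The following is only a minimal sketch for checking that. The helper names `keras_flavor` and `flavors_in` are hypothetical, and the paths are copied from the traceback above with the `site-packages/` prefix removed.

```python
import re

# Frame paths taken from the traceback in this issue,
# shortened to the part after ".../site-packages/".
FRAME_PATHS = [
    "tensorflow_core/python/keras/engine/training.py",
    "tensorflow_core/python/keras/engine/base_layer.py",
    "tensorflow_core/python/autograph/impl/api.py",
    "keras/engine/base_layer.py",
    "keras/engine/network.py",
    "keras/layers/merge.py",
]


def keras_flavor(path):
    """Classify a site-packages-relative path by the Keras package it belongs to."""
    if re.match(r"tensorflow(_core)?/python/keras/", path):
        return "tf.keras"
    if path.startswith("keras/"):
        return "standalone keras"
    return "other"


def flavors_in(paths):
    """Return the sorted set of Keras flavors that appear in the given paths."""
    return sorted({keras_flavor(p) for p in paths} - {"other"})


# Two entries in the result mean both Keras packages are on the call stack.
print(flavors_in(FRAME_PATHS))
```

If both flavors appear, one thing that may be worth trying is setting the environment variable `TF_KERAS=1` before importing bert4keras, which the bert4keras README describes as the way to make the library build on tf.keras instead of standalone Keras. Whether that resolves the `NameError` in this particular setup has not been verified here.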