zjunlp / MKGformer

[SIGIR 2022] Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion

Error when running the MRE part (Resolved) #3

Closed · ZihaoZheng98 closed 2 years ago

ZihaoZheng98 commented 2 years ago

Hi, when I run the MRE part, the code throws an error:

```
Traceback (most recent call last):
  File "run.py", line 153, in <module>
    main()
  File "run.py", line 147, in main
    trainer.train()
  File "/users5/zhzheng/MKGformer-main/MRE/modules/train.py", line 54, in train
    (loss, logits), labels = self._step(batch, mode="train")
  File "/users5/zhzheng/MKGformer-main/MRE/modules/train.py", line 172, in _step
    outputs = self.model(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids, labels=labels, images=images, aux_imgs=aux_imgs, rcnn_imgs=rcnn_imgs)
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/unimo_model.py", line 72, in forward
    return_dict=True,)
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 721, in forward
    return_dict=return_dict,
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 620, in forward
    current_layer=idx,
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 548, in forward
    self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, attention_output, fusion_output
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1974, in apply_chunking_to_forward
    input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1974, in <genexpr>
    input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
AttributeError: 'NoneType' object has no attribute 'shape'
```

I looked into it, and it seems to come from the code around line 532 of modeling_unimo.py:

```python
self_attention_outputs, fusion_output, qks = self.attention(
    hidden_states,
    attention_mask,
    head_mask,
    output_attentions=output_attentions,
    visual_hidden_state=visual_hidden_state,
    output_qks=output_qks,
    current_layer=current_layer,
)
attention_output = self_attention_outputs[0]
outputs = self_attention_outputs[1:]  # add self attentions if we output attention weights
```

Here fusion_output is None, which in turn is caused by visual_hidden_state being None. But due to the code logic of UniEncoder,

```python
# text
# TODO: 9-12 layers past vison qks to text
last_hidden_state = vision_hidden_states if idx >= 8 else None
output_qks = True if idx >= 7 else None
layer_head_mask = head_mask[idx] if head_mask is not None else None
text_layer_module = self.text_layer[idx]
text_layer_output = text_layer_module(
    text_hidden_states,
    attention_mask=attention_mask,
    head_mask=layer_head_mask,
    visual_hidden_state=last_hidden_state,
    output_attentions=output_attentions,
    output_qks=output_qks,
    current_layer=idx,
)
```

the first eight text layers (idx < 8) are called with visual_hidden_state=None, which is exactly what triggers the error. Could you help me figure out how to solve this? Thanks a lot!
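
To make the schedule concrete, here is my reading of the quoted loop as a standalone sketch (12 text layers, 0-indexed; this is my illustration, not code from the repo):

```python
# Standalone sketch of the per-layer schedule implied by the quoted loop:
for idx in range(12):
    gets_vision_state = idx >= 8  # only the last four layers receive vision_hidden_states
    outputs_qks = idx >= 7        # qks are requested one layer earlier
    print(f"layer {idx}: visual_hidden_state={'set' if gets_vision_state else 'None'}, "
          f"output_qks={outputs_qks}")
```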

flow3rdown commented 2 years ago

Hello, fusion_output is mainly used in BertIntermediate to fuse information from the different modalities. Around line 435 of modeling_unimo.py there is a check for fusion_output being None:

```python
if fusion_output is not None:
    fusion_states = self.fusion_dense(fusion_output)
    hidden_states = hidden_states + fusion_states
```

BertIntermediate is invoked through apply_chunking_to_forward, which is an internal transformers method. We suspect the error is caused by your transformers version differing from ours; please check that your transformers version is 4.11.3.
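
You can confirm the installed version with a quick check:

```python
import transformers

# MKGformer expects transformers 4.11.3; other versions may hit the chunking issue below.
assert transformers.__version__ == "4.11.3", f"found transformers {transformers.__version__}"
```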

flow3rdown commented 2 years ago

Sorry, we just noticed that the provided requirement.txt listed two different transformers versions; this has been fixed.

ZihaoZheng98 commented 2 years ago

@Flow3rDown Hi, thanks a lot for the answer. I checked the transformers 4.11.3 code, and its logic is the same as the 4.8 version I am using:

```python
def apply_chunking_to_forward(
    forward_fn: Callable[..., torch.Tensor], chunk_size: int, chunk_dim: int, *input_tensors
) -> torch.Tensor:
    """
    This function chunks the :obj:`input_tensors` into smaller input tensor parts of size :obj:`chunk_size` over the
    dimension :obj:`chunk_dim`. It then applies a layer :obj:`forward_fn` to each chunk independently to save memory.
    If the :obj:`forward_fn` is independent across the :obj:`chunk_dim` this function will yield the same result as
    directly applying :obj:`forward_fn` to :obj:`input_tensors`.
    Args:
        forward_fn (:obj:`Callable[..., torch.Tensor]`):
            The forward function of the model.
        chunk_size (:obj:`int`):
            The chunk size of a chunked tensor: :obj:`num_chunks = len(input_tensors[0]) / chunk_size`.
        chunk_dim (:obj:`int`):
            The dimension over which the :obj:`input_tensors` should be chunked.
        input_tensors (:obj:`Tuple[torch.Tensor]`):
            The input tensors of ``forward_fn`` which will be chunked
    Returns:
        :obj:`torch.Tensor`: A tensor with the same shape as the :obj:`forward_fn` would have given if applied`.
    Examples::
        # rename the usual forward() fn to forward_chunk()
        def forward_chunk(self, hidden_states):
            hidden_states = self.decoder(hidden_states)
            return hidden_states
        # implement a chunked forward function
        def forward(self, hidden_states):
            return apply_chunking_to_forward(self.forward_chunk, self.chunk_size_lm_head, self.seq_len_dim, hidden_states)
    """

    assert len(input_tensors) > 0, f"{input_tensors} has to be a tuple/list of tensors"

    # inspect.signature exist since python 3.5 and is a python method -> no problem with backward compatibility
    num_args_in_forward_chunk_fn = len(inspect.signature(forward_fn).parameters)
    if num_args_in_forward_chunk_fn != len(input_tensors):
        raise ValueError(
            f"forward_chunk_fn expects {num_args_in_forward_chunk_fn} arguments, but only {len(input_tensors)} input "
            "tensors are given"
        )

    if chunk_size > 0:
        tensor_shape = input_tensors[0].shape[chunk_dim]
        for input_tensor in input_tensors:
            if input_tensor.shape[chunk_dim] != tensor_shape:
                raise ValueError(
                    f"All input tenors have to be of the same shape: {tensor_shape}, "
                    f"found shape {input_tensor.shape[chunk_dim]}"
                )

        if input_tensors[0].shape[chunk_dim] % chunk_size != 0:
            raise ValueError(
                f"The dimension to be chunked {input_tensors[0].shape[chunk_dim]} has to be a multiple of the chunk "
                f"size {chunk_size}"
            )

        num_chunks = input_tensors[0].shape[chunk_dim] // chunk_size

        # chunk input tensor into tuples
        input_tensors_chunks = tuple(input_tensor.chunk(num_chunks, dim=chunk_dim) for input_tensor in input_tensors)
        # apply forward fn to every tuple
        output_chunks = tuple(forward_fn(*input_tensors_chunk) for input_tensors_chunk in zip(*input_tensors_chunks))
        # concatenate output at same dimension
        return torch.cat(output_chunks, dim=chunk_dim)

    return forward_fn(*input_tensors)
```

Looking at the logic of apply_chunking_to_forward: inside the if chunk_size > 0 block, it first loops over each input tensor and accesses .shape, and that is where my "'NoneType' object has no attribute 'shape'" error comes from. The code you mentioned runs after this loop, but the error is already raised inside the loop.

```python
self_attention_outputs, fusion_output, qks = self.attention(
    hidden_states,
    attention_mask,
    head_mask,
    output_attentions=output_attentions,
    visual_hidden_state=visual_hidden_state,
    output_qks=output_qks,
    current_layer=current_layer,
)
attention_output = self_attention_outputs[0]

outputs = self_attention_outputs[1:]  # add self attentions if we output attention weights

layer_output = apply_chunking_to_forward(
    self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, attention_output, fusion_output
)
```

Here fusion_output is None, it then enters apply_chunking_to_forward, and the error is raised inside the if chunk_size > 0 branch? Or have I misunderstood the logic of this code?
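
As an aside, I imagine a local workaround (hypothetical, not necessarily the intended fix) would be to bypass the chunking helper when fusion_output is None, since feed_forward_chunk already handles that case via the check you quoted:

```python
# Hypothetical workaround in modeling_unimo.py, not the repo's actual fix:
# skip apply_chunking_to_forward when there is no fusion output to chunk.
if fusion_output is None:
    layer_output = self.feed_forward_chunk(attention_output, fusion_output)
else:
    layer_output = apply_chunking_to_forward(
        self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim,
        attention_output, fusion_output,
    )
```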

flow3rdown commented 2 years ago

In transformers 4.11.3 the .shape access happens inside if chunk_size > 0, while in transformers 4.8.0 it happens outside that check. In the BERT model, chunk_size defaults to 0, so on 4.11.3 the if chunk_size > 0 branch is never entered, and the shape problem never occurs.
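
A minimal standalone snippet illustrating the difference (feed_forward_chunk here is a stand-in for the repo's method; chunk_size=0 matches BERT's default chunk_size_feed_forward, and chunk_dim=1 corresponds to seq_len_dim):

```python
import torch
from transformers.modeling_utils import apply_chunking_to_forward

def feed_forward_chunk(attention_output, fusion_output):
    # Mirrors the call site above: fusion_output may legitimately be None.
    return attention_output

x = torch.randn(2, 5, 8)

# On transformers 4.11.3 this returns x unchanged: the shape check sits inside
# `if chunk_size > 0` and is skipped when chunk_size == 0.
# On 4.8.0 the shape check runs unconditionally, so the None argument raises
# AttributeError: 'NoneType' object has no attribute 'shape'.
out = apply_chunking_to_forward(feed_forward_chunk, 0, 1, x, None)
print(out.shape)  # torch.Size([2, 5, 8])
```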

ZihaoZheng98 commented 2 years ago

Great, it runs correctly now. Thanks for the patient explanation!