ZihaoZheng98 closed this issue 2 years ago.
Hi, fusion_output is mainly used in BertIntermediate to fuse the information from the different modalities. Line 435 of modeling_unimo.py contains the check that handles fusion_output being None:
if fusion_output is not None:
    fusion_states = self.fusion_dense(fusion_output)
    hidden_states = hidden_states + fusion_states
BertIntermediate itself is invoked through apply_chunking_to_forward, which is an internal transformers method, so we suspect the error comes from your transformers version differing from ours. Please check whether your transformers version is 4.11.3.
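For readers following along, here is a minimal, self-contained sketch of the call path described above. Apart from fusion_output, fusion_dense, feed_forward_chunk, chunk_size_feed_forward, and seq_len_dim, every name and size below is an illustrative assumption, not the actual MKGformer code:

import torch
import torch.nn as nn
from transformers.modeling_utils import apply_chunking_to_forward  # transformers 4.x


class FusedIntermediate(nn.Module):
    # BertIntermediate-style block that optionally fuses a second modality.
    def __init__(self, hidden_size=768, intermediate_size=3072):
        super().__init__()
        self.dense = nn.Linear(hidden_size, intermediate_size)
        self.fusion_dense = nn.Linear(hidden_size, intermediate_size)
        self.act = nn.GELU()

    def forward(self, hidden_states, fusion_output=None):
        hidden_states = self.dense(hidden_states)
        if fusion_output is not None:  # the None guard quoted above
            fusion_states = self.fusion_dense(fusion_output)
            hidden_states = hidden_states + fusion_states
        return self.act(hidden_states)


class FusedLayer(nn.Module):
    # The intermediate is reached through apply_chunking_to_forward, as in BertLayer.
    def __init__(self):
        super().__init__()
        self.intermediate = FusedIntermediate()
        self.chunk_size_feed_forward = 0  # BERT default: no chunking
        self.seq_len_dim = 1

    def feed_forward_chunk(self, attention_output, fusion_output):
        return self.intermediate(attention_output, fusion_output)

    def forward(self, attention_output, fusion_output=None):
        return apply_chunking_to_forward(
            self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim,
            attention_output, fusion_output,
        )

Under transformers 4.11.3 this sketch runs even when fusion_output is None; how that case behaves across versions is what the rest of the thread turns out to hinge on.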
Sorry, we just noticed that the provided requirement.txt contained two different transformers versions; it has been corrected.
@Flow3rDown Hi, thanks a lot for your explanation. I checked the transformers 4.11.3 code, and the logic is the same as the 4.8 version I am using:
def apply_chunking_to_forward(
    forward_fn: Callable[..., torch.Tensor], chunk_size: int, chunk_dim: int, *input_tensors
) -> torch.Tensor:
    """
    This function chunks the :obj:`input_tensors` into smaller input tensor parts of size :obj:`chunk_size` over the
    dimension :obj:`chunk_dim`. It then applies a layer :obj:`forward_fn` to each chunk independently to save memory.

    If the :obj:`forward_fn` is independent across the :obj:`chunk_dim` this function will yield the same result as
    directly applying :obj:`forward_fn` to :obj:`input_tensors`.

    Args:
        forward_fn (:obj:`Callable[..., torch.Tensor]`):
            The forward function of the model.
        chunk_size (:obj:`int`):
            The chunk size of a chunked tensor: :obj:`num_chunks = len(input_tensors[0]) / chunk_size`.
        chunk_dim (:obj:`int`):
            The dimension over which the :obj:`input_tensors` should be chunked.
        input_tensors (:obj:`Tuple[torch.Tensor]`):
            The input tensors of ``forward_fn`` which will be chunked

    Returns:
        :obj:`torch.Tensor`: A tensor with the same shape as the :obj:`forward_fn` would have given if applied`.

    Examples::

        # rename the usual forward() fn to forward_chunk()
        def forward_chunk(self, hidden_states):
            hidden_states = self.decoder(hidden_states)
            return hidden_states

        # implement a chunked forward function
        def forward(self, hidden_states):
            return apply_chunking_to_forward(self.forward_chunk, self.chunk_size_lm_head, self.seq_len_dim, hidden_states)
    """

    assert len(input_tensors) > 0, f"{input_tensors} has to be a tuple/list of tensors"

    # inspect.signature exist since python 3.5 and is a python method -> no problem with backward compatibility
    num_args_in_forward_chunk_fn = len(inspect.signature(forward_fn).parameters)
    if num_args_in_forward_chunk_fn != len(input_tensors):
        raise ValueError(
            f"forward_chunk_fn expects {num_args_in_forward_chunk_fn} arguments, but only {len(input_tensors)} input "
            "tensors are given"
        )

    if chunk_size > 0:
        tensor_shape = input_tensors[0].shape[chunk_dim]
        for input_tensor in input_tensors:
            if input_tensor.shape[chunk_dim] != tensor_shape:
                raise ValueError(
                    f"All input tenors have to be of the same shape: {tensor_shape}, "
                    f"found shape {input_tensor.shape[chunk_dim]}"
                )

        if input_tensors[0].shape[chunk_dim] % chunk_size != 0:
            raise ValueError(
                f"The dimension to be chunked {input_tensors[0].shape[chunk_dim]} has to be a multiple of the chunk "
                f"size {chunk_size}"
            )

        num_chunks = input_tensors[0].shape[chunk_dim] // chunk_size

        # chunk input tensor into tuples
        input_tensors_chunks = tuple(input_tensor.chunk(num_chunks, dim=chunk_dim) for input_tensor in input_tensors)
        # apply forward fn to every tuple
        output_chunks = tuple(forward_fn(*input_tensors_chunk) for input_tensors_chunk in zip(*input_tensors_chunks))
        # concatenate output at same dimension
        return torch.cat(output_chunks, dim=chunk_dim)

    return forward_fn(*input_tensors)
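As a quick standalone sanity check of what this function does (this is not code from the thread; on the 4.x versions discussed here it is importable from transformers.modeling_utils):

import torch
from transformers.modeling_utils import apply_chunking_to_forward


def double(x):
    return x * 2


x = torch.randn(2, 6, 4)

# chunk_size=3 over dim=1: x is split into two chunks of length 3, double()
# runs on each chunk, and the results are concatenated back along dim=1.
y = apply_chunking_to_forward(double, 3, 1, x)
assert torch.equal(y, double(x))

# chunk_size=0 (the BERT default): no chunking, double(x) is called directly.
assert torch.equal(apply_chunking_to_forward(double, 0, 1, x), double(x))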
Looking at the logic of this apply_chunking_to_forward code: inside the if chunk_size > 0 block it first loops over every input tensor, and that loop already uses .shape, which is why I get the error that None has no shape attribute. The code you just mentioned runs after that loop, but the error is raised inside the loop itself.
self_attention_outputs, fusion_output, qks = self.attention(
    hidden_states,
    attention_mask,
    head_mask,
    output_attentions=output_attentions,
    visual_hidden_state=visual_hidden_state,
    output_qks=output_qks,
    current_layer=current_layer,
)
attention_output = self_attention_outputs[0]
outputs = self_attention_outputs[1:]  # add self attentions if we output attention weights
layer_output = apply_chunking_to_forward(
    self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, attention_output, fusion_output
)
So fusion_output is None here, it then goes into apply_chunking_to_forward, and the error is raised in the if chunk_size > 0 part of the logic, right? Or am I misreading the logical structure of this code?
In transformers 4.11.3 the tensor shapes are accessed inside the if chunk_size > 0 block, whereas in transformers 4.8.0 they are accessed outside of it. In the BERT model chunk_size defaults to 0, so under transformers 4.11.3 the if chunk_size > 0 branch is never entered and the .shape problem never occurs.
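A minimal reproduction of that difference, assuming only what the quoted source and the traceback show (the feed_forward_chunk below is a stand-in, not the MKGformer one):

import torch
from transformers.modeling_utils import apply_chunking_to_forward


def feed_forward_chunk(attention_output, fusion_output):
    # Stand-in chunk function: fusion_output may legitimately be None.
    return attention_output


hidden = torch.randn(2, 5, 8)

# chunk_size == 0 (the BERT default):
#   - on transformers 4.11.3 the .shape checks sit inside the "if chunk_size > 0"
#     block, so this call simply runs feed_forward_chunk(hidden, None) and succeeds;
#   - on transformers 4.8.0 the "all inputs have the same shape" check runs before
#     that guard, so the same call raises
#     AttributeError: 'NoneType' object has no attribute 'shape'.
out = apply_chunking_to_forward(feed_forward_chunk, 0, 1, hidden, None)
print(out.shape)  # torch.Size([2, 5, 8]) on 4.11.3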
Great, it runs fine for me now. Thanks for your patient explanation!
Hi, when I run the MRE part, the code throws an error:

Traceback (most recent call last):
  File "run.py", line 153, in <module>
    main()
  File "run.py", line 147, in main
    trainer.train()
  File "/users5/zhzheng/MKGformer-main/MRE/modules/train.py", line 54, in train
    (loss, logits), labels = self._step(batch, mode="train")
  File "/users5/zhzheng/MKGformer-main/MRE/modules/train.py", line 172, in _step
    outputs = self.model(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids, labels=labels, images=images, aux_imgs=aux_imgs, rcnn_imgs=rcnn_imgs)
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/unimo_model.py", line 72, in forward
    return_dict=True,)
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 721, in forward
    return_dict=return_dict,
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 620, in forward
    current_layer=idx,
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 548, in forward
    self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, attention_output, fusion_output
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1974, in apply_chunking_to_forward
    input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
  File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1974, in <genexpr>
    input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
AttributeError: 'NoneType' object has no attribute 'shape'
I looked into it: around line 532 of modeling_unimo.py, fusion_output is None, which is caused by visual_hidden_state being None. Because of the UniEncoder code logic, the visual_hidden_state passed into layers 9-12 is None, and that is what triggers the error. Could you help me figure out how to fix this? Thanks a lot!
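For anyone who cannot move to transformers 4.11.3 (the fix that resolved this issue above), one possible local workaround sketch is to avoid handing None to apply_chunking_to_forward at all. This is an illustrative patch around the call quoted earlier from modeling_unimo.py, not the maintainers' fix, and the lambda wrapper is an assumption added here:

# Inside the layer's forward, where attention_output and fusion_output are defined.
if fusion_output is not None:
    layer_output = apply_chunking_to_forward(
        self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim,
        attention_output, fusion_output,
    )
else:
    # Wrap the chunk function so only a real tensor is passed; the signature
    # check inside apply_chunking_to_forward then expects exactly one input,
    # and older transformers versions never call .shape on None.
    layer_output = apply_chunking_to_forward(
        lambda attn: self.feed_forward_chunk(attn, None),
        self.chunk_size_feed_forward, self.seq_len_dim, attention_output,
    )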