Describe the bug/ 问题描述 (Mandatory / 必填)
When running Bert-LSTM-CRF with MindNLP 0.2.3, the following error is raised: ValueError: For 'MatMul' the input dimensions must be equal, but got 'x1_col': 768 and 'x2_row': 3072.
Hardware Environment(Ascend/GPU/CPU) / 硬件环境: 910A
Software Environment / 软件环境 (Mandatory / 必填):
-- MindSpore version (e.g., 1.7.0.Bxxx) : 2.2.0
-- Python version (e.g., Python 3.7.5) : 3.9.18
-- OS platform and distribution (e.g., Linux Ubuntu 16.04): EulerOS 2.8
-- GCC/Compiler version (if compiled from source):
Execute Mode / 执行模式 (Mandatory / 必填) (PyNative/Graph):

To Reproduce / 重现步骤 (Mandatory / 必填)
Steps to reproduce the behavior:
1. Install all dependencies required by MindNLP.
2. Enter mindnlp/examples/sequence_labeling.
3. Open Bert-LSTM-CRF.ipynb.
4. Run the cells in order.
5. The error is raised when trainer.run(tgt_columns="labels") is executed.
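For context, here is a sketch of the model cell in which the failure surfaces, reconstructed from the `Cell In[6]` frame in the traceback below. Only the forward pass is shown; the `__init__` wiring of `bert_model`, `bilstm`, and `crf_hidden_fc` is assumed to follow the notebook.

```python
import mindspore

# Sketch reconstructed from the `Cell In[6]` traceback frame; the
# constructor (bert_model, bilstm, crf_hidden_fc) is elided here.
class Bert_LSTM_CRF(mindspore.nn.Cell):
    def construct(self, ids, seq_length=None, labels=None):
        attention_mask = (ids > mindspore.tensor(0))
        # The ValueError below is raised inside this BertModel call:
        output = self.bert_model(input_ids=ids, attention_mask=attention_mask)
        lstm_feat, _ = self.bilstm(output[0])
        emissions = self.crf_hidden_fc(lstm_feat)
        ...
```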
```
Cell In[11], line 1 ----> 1 trainer.run(tgt_columns="labels")
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py:200, in Trainer.run(self, tgt_columns) 197 run_context = RunContext(args_dict) 198 self.callback_manager.train_begin(run_context) --> 200 self._run(run_context, tgt_columns) 201 self.callback_manager.train_end(run_context)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py:238, in Trainer._run(self, run_context, tgt_columns) 236 self.callback_manager.train_step_begin(run_context) 237 if self.obj_network: --> 238 loss = self.train_fn(data) 239 else: 240 loss = self.train_fn(tgts, data)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/utils.py:85, in get_default_train_step_fn.<locals>.default_run_step_for_obj_net(*args, **kwargs)
83 status = init_status()
84 args = ops.depend(args, status)
---> 85 loss, grads = grad_fn(*args, **kwargs)
86 loss = loss_scaler.unscale(loss)
87 if check_gradients:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py:625, in _Grad.__call__.<locals>.after_grad(*args, **kwargs)
624 def after_grad(*args, **kwargs):
--> 625 return grad_(fn_, weights)(*args, **kwargs)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/common/api.py:121, in _wrap_func.<locals>.wrapper(*arg, **kwargs)
119 @wraps(fn)
120 def wrapper(*arg, **kwargs):
--> 121 results = fn(*arg, **kwargs)
122 return _convert_python_data(results)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py:600, in _Grad.__call__.<locals>.after_grad(*args, **kwargs)
598 @_wrap_func
599 def after_grad(*args, **kwargs):
--> 600 res = self._pynative_forward_run(fn, grad_, weights, args, kwargs)
601 _pynative_executor.grad(fn, grad_, weights, grad_position, *args, **kwargs)
602 out = _pynative_executor()
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py:650, in _Grad._pynative_forward_run(self, fn, grad, weights, args, kwargs) 648 _pynative_executor.set_grad_flag(True) 649 _pynative_executor.new_graph(fn, *args, **new_kwargs) --> 650 outputs = fn(*args, **new_kwargs) 651 _pynative_executor.end_graph(fn, outputs, *args, **new_kwargs) 652 return outputs
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/utils.py:49, in get_default_forward_fn_without_loss_fn.<locals>.forward_fn(*args, **kwargs)
47 def forward_fn(*args, **kwargs):
48 outputs_list = ()
---> 49 outputs = network(*args, **kwargs)
50 if isinstance(outputs, tuple):
51 outputs_list += outputs
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:705, in Cell.__call__(self, *args, **kwargs) 703 except Exception as err: 704 _pynative_executor.clear_res() --> 705 raise err 707 if isinstance(output, Parameter): 708 output = output.data
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:701, in Cell.__call__(self, *args, **kwargs) 699 try: 700 _pynative_executor.new_graph(self, *args, **kwargs) --> 701 output = self._run_construct(args, kwargs) 702 _pynative_executor.end_graph(self, output, *args, **kwargs) 703 except Exception as err:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:482, in Cell._run_construct(self, cast_inputs, kwargs) 480 output = self._shard_fn(*cast_inputs, **kwargs) 481 else: --> 482 output = self.construct(*cast_inputs, **kwargs) 483 if self._enable_forward_hook: 484 output = self._run_forward_hook(cast_inputs, output)
Cell In[6], line 13, in Bert_LSTM_CRF.construct(self, ids, seq_length, labels) 11 def construct(self, ids, seq_length=None, labels=None): 12 attention_mask = (ids > mindspore.tensor(0)) ---> 13 output = self.bert_model(input_ids=ids, attention_mask=attention_mask) 14 lstm_feat, _ = self.bilstm(output[0]) 15 emissions = self.crf_hidden_fc(lstm_feat)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:705, in Cell.__call__(self, *args, **kwargs) 703 except Exception as err: 704 _pynative_executor.clear_res() --> 705 raise err 707 if isinstance(output, Parameter): 708 output = output.data
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:701, in Cell.__call__(self, *args, **kwargs) 699 try: 700 _pynative_executor.new_graph(self, *args, **kwargs) --> 701 output = self._run_construct(args, kwargs) 702 _pynative_executor.end_graph(self, output, *args, **kwargs) 703 except Exception as err:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:482, in Cell._run_construct(self, cast_inputs, kwargs) 480 output = self._shard_fn(*cast_inputs, **kwargs) 481 else: --> 482 output = self.construct(*cast_inputs, **kwargs) 483 if self._enable_forward_hook: 484 output = self._run_forward_hook(cast_inputs, output)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/bert/modeling_bert.py:802, in BertModel.construct(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict) 792 head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers) 794 embedding_output = self.embeddings( 795 input_ids=input_ids, 796 position_ids=position_ids, (...) 799 past_key_values_length=past_key_values_length, 800 ) --> 802 encoder_outputs = self.encoder( 803 embedding_output, 804 attention_mask=extended_attention_mask, 805 head_mask=head_mask, 806 encoder_hidden_states=encoder_hidden_states, 807 encoder_attention_mask=encoder_extended_attention_mask, 808 past_key_values=past_key_values, 809 use_cache=use_cache, 810 output_attentions=output_attentions, 811 output_hidden_states=output_hidden_states, 812 return_dict=return_dict, 813 ) 815 sequence_output = encoder_outputs[0] 816 pooled_output = self.pooler(sequence_output) if self.pooler is not None else None
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:705, in Cell.__call__(self, *args, **kwargs) 703 except Exception as err: 704 _pynative_executor.clear_res() --> 705 raise err 707 if isinstance(output, Parameter): 708 output = output.data
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:701, in Cell.__call__(self, *args, **kwargs) 699 try: 700 _pynative_executor.new_graph(self, *args, **kwargs) --> 701 output = self._run_construct(args, kwargs) 702 _pynative_executor.end_graph(self, output, *args, **kwargs) 703 except Exception as err:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:482, in Cell._run_construct(self, cast_inputs, kwargs) 480 output = self._shard_fn(*cast_inputs, **kwargs) 481 else: --> 482 output = self.construct(*cast_inputs, **kwargs) 483 if self._enable_forward_hook: 484 output = self._run_forward_hook(cast_inputs, output)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/bert/modeling_bert.py:525, in BertEncoder.construct(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict) 523 layer_head_mask = head_mask[i] if head_mask is not None else None 524 past_key_value = past_key_values[i] if past_key_values is not None else None --> 525 layer_outputs = layer_module( 526 hidden_states, 527 attention_mask, 528 layer_head_mask, 529 encoder_hidden_states, 530 encoder_attention_mask, 531 past_key_value, 532 output_attentions, 533 ) 534 hidden_states = layer_outputs[0] 535 if use_cache:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:705, in Cell.__call__(self, *args, **kwargs) 703 except Exception as err: 704 _pynative_executor.clear_res() --> 705 raise err 707 if isinstance(output, Parameter): 708 output = output.data
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:701, in Cell.__call__(self, *args, **kwargs) 699 try: 700 _pynative_executor.new_graph(self, *args, **kwargs) --> 701 output = self._run_construct(args, kwargs) 702 _pynative_executor.end_graph(self, output, *args, **kwargs) 703 except Exception as err:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:482, in Cell._run_construct(self, cast_inputs, kwargs) 480 output = self._shard_fn(*cast_inputs, **kwargs) 481 else: --> 482 output = self.construct(*cast_inputs, **kwargs) 483 if self._enable_forward_hook: 484 output = self._run_forward_hook(cast_inputs, output)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/bert/modeling_bert.py:474, in BertLayer.construct(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, past_key_value, output_attentions) 471 cross_attn_present_key_value = cross_attention_outputs[-1] 472 present_key_value = present_key_value + cross_attn_present_key_value --> 474 layer_output = apply_chunking_to_forward( 475 self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, attention_output 476 ) 477 outputs = (layer_output,) + outputs 479 # if decoder, return the attn key/values as the last output
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/ms_utils.py:188, in apply_chunking_to_forward(forward_fn, chunk_size, chunk_axis, *input_tensors) 185 # concatenate output at same dimension 186 return ops.cat(output_chunks, axis=chunk_axis) --> 188 return forward_fn(*input_tensors)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/bert/modeling_bert.py:488, in BertLayer.feed_forward_chunk(self, attention_output) 486 """feed forward chunk""" 487 intermediate_output = self.intermediate(attention_output) --> 488 layer_output = self.output(intermediate_output, attention_output) 489 return layer_output
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:705, in Cell.__call__(self, *args, **kwargs) 703 except Exception as err: 704 _pynative_executor.clear_res() --> 705 raise err 707 if isinstance(output, Parameter): 708 output = output.data
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:701, in Cell.__call__(self, *args, **kwargs) 699 try: 700 _pynative_executor.new_graph(self, *args, **kwargs) --> 701 output = self._run_construct(args, kwargs) 702 _pynative_executor.end_graph(self, output, *args, **kwargs) 703 except Exception as err:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:482, in Cell._run_construct(self, cast_inputs, kwargs) 480 output = self._shard_fn(*cast_inputs, **kwargs) 481 else: --> 482 output = self.construct(*cast_inputs, **kwargs) 483 if self._enable_forward_hook: 484 output = self._run_forward_hook(cast_inputs, output)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/bert/modeling_bert.py:397, in BertOutput.construct(self, hidden_states, input_tensor) 395 hidden_states = self.dense(hidden_states) 396 hidden_states = self.dropout(hidden_states) --> 397 hidden_states = self.LayerNorm(hidden_states + input_tensor) 398 return hidden_states
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:705, in Cell.__call__(self, *args, **kwargs) 703 except Exception as err: 704 _pynative_executor.clear_res() --> 705 raise err 707 if isinstance(output, Parameter): 708 output = output.data
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:701, in Cell.__call__(self, *args, **kwargs) 699 try: 700 _pynative_executor.new_graph(self, *args, **kwargs) --> 701 output = self._run_construct(args, kwargs) 702 _pynative_executor.end_graph(self, output, *args, **kwargs) 703 except Exception as err:
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/nn/cell.py:482, in Cell._run_construct(self, cast_inputs, kwargs) 480 output = self._shard_fn(*cast_inputs, **kwargs) 481 else: --> 482 output = self.construct(*cast_inputs, **kwargs) 483 if self._enable_forward_hook: 484 output = self._run_forward_hook(cast_inputs, output)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/injection.py:789, in LayerNorm.construct(self, input_x) 787 def construct(self, input_x): 788 if self.elementwise_affine: --> 789 y, _, _ = self.layer_norm(input_x, self.weight.astype(input_x.dtype), self.bias.astype(input_x.dtype)) 790 else: 791 y, _, _ = self.layer_norm(input_x, ops.ones(self.normalized_shape, input_x.dtype), 792 ops.zeros(self.normalized_shape, input_x.dtype),)
File ~/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/common/_stub_tensor.py:94, in StubTensor.dtype(self) 92 if self.stub: 93 if not hasattr(self, "stub_dtype"): ---> 94 self.stub_dtype = self.stub.get_dtype() 95 return self.stub_dtype 96 return self.tensor.dtype
ValueError: For 'MatMul' the input dimensions must be equal, but got 'x1_col': 768 and 'x2_row': 3072.
Ascend Warning Message:
W49999: If want to reuse binary file, please donwload binary file and install first![FUNC:BuildFusionOp][FILE:fusion_manager.cc][LINE:4254]
W11001: Op [DropOutGenMask] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGrad/dgateReshapeNode] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGrad/DynamicRNNGraddxReshapeNode] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradBody/DynamicRNNGradbodyDgateReshapeNode] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradBody/DynamicRNNGradbodyDxReshapeNode] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_2_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_3_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_4_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_7_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_8_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_9_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_10_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_12_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
W11001: Op [DynamicRNNGradWhile_Op_input_11_Memcpy] does not hit the high-priority operator information library, which might result in compromised performance.
C++ Call Stack: (For framework developers)
mindspore/core/ops/mat_mul.cc:107 InferShape
```
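The mismatched sizes line up with bert-base dimensions (hidden_size = 768, intermediate_size = 3072), so the failure appears to come from BertOutput.dense, which maps 3072 back down to 768. A minimal sketch that reproduces the same class of error on a bare nn.Dense, assuming the layer is handed a 768-wide tensor where a 3072-wide one is expected:

```python
import mindspore
from mindspore import nn, ops

# Hypothetical repro: BertOutput.dense maps intermediate_size (3072)
# back to hidden_size (768); feeding it a hidden_size-wide tensor
# trips the same MatMul shape check reported above.
dense = nn.Dense(3072, 768)
x = ops.ones((2, 768), mindspore.float32)  # wrong width: 768 instead of 3072
y = dense(x)  # ValueError: For 'MatMul' the input dimensions must be equal ...
```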