Framework not specified. Using pt to export the model.
=====Exporting IR=====
Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s]
Loading checkpoint shards: 14%|█▍ | 1/7 [00:26<02:36, 26.10s/it]
Loading checkpoint shards: 29%|██▊ | 2/7 [00:48<02:00, 24.16s/it]
Loading checkpoint shards: 43%|████▎ | 3/7 [01:13<01:38, 24.57s/it]
Loading checkpoint shards: 57%|█████▋ | 4/7 [01:41<01:17, 25.85s/it]
Loading checkpoint shards: 71%|███████▏ | 5/7 [01:53<00:41, 20.73s/it]
Loading checkpoint shards: 86%|████████▌ | 6/7 [02:23<00:23, 23.81s/it]
Loading checkpoint shards: 100%|██████████| 7/7 [02:28<00:00, 17.72s/it]
Loading checkpoint shards: 100%|██████████| 7/7 [02:28<00:00, 21.20s/it]
Using framework PyTorch: 2.2.2+cpu
WARNING:root:Cannot apply model.to_bettertransformer because of the exception:
The model type chatglm is not yet supported to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).. Usage model with stateful=True may be non-effective if model does not contain torch.functional.scaled_dot_product_attention
Overriding 1 configuration item(s)
use_cache -> True
/home/wanglaiqi/.cache/huggingface/modules/transformers_modules/chatglm3-6b-32k/modeling_chatglm.py:821: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if (attention_mask is not None and not attention_mask.all()) or (past_key_values and seq_length != 1):
/home/wanglaiqi/.cache/huggingface/modules/transformers_modules/chatglm3-6b-32k/modeling_chatglm.py:687: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if past_length:
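The two TracerWarnings above come from data-dependent Python `if` checks on tensors in modeling_chatglm.py; torch.jit.trace only records the branch taken for the example input. A minimal, self-contained sketch of the same behaviour (illustrative, not taken from the log):

```python
import torch

def f(x):
    # `x.sum() > 0` is a tensor; the `if` converts it to a Python bool, which
    # torch.jit.trace cannot record -- it emits the same TracerWarning and
    # freezes whichever branch the example input happened to take.
    if x.sum() > 0:
        return x + 1
    return x - 1

traced = torch.jit.trace(f, torch.ones(3))  # traces only the `x + 1` branch
print(traced(torch.full((3,), -5.0)))       # still x + 1: tensor([-4., -4., -4.])
```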
Export model to OpenVINO directly failed with:
Couldn't get TorchScript module by tracing. With exception:
The size of tensor a (18) must match the size of tensor b (32) at non-singleton dimension 2
Please check correctness of provided 'example_input'. You can also provide TorchScript module that you obtained yourself, please refer to PyTorch documentation: https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html..
Model will be exported to ONNX
[ WARNING ] Making stateful models is not supported when exporting to ONNX as an intermediate step. A stateless model will be exported instead. It may result in sub-optimal inference performance. Provide a model that can be converted to OpenVINO without fallback to ONNX conversion path.
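One way to sidestep the 'example_input' mismatch reported above is to convert the model with inputs built by its own tokenizer, so input_ids and attention_mask share the same sequence length. This is only a sketch under assumptions (the model id, prompt, and whether chatglm3-6b-32k converts cleanly at all are not confirmed by this log):

```python
import torch
import openvino as ov
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/chatglm3-6b-32k"  # assumed id; substitute the local checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.float32)
model.eval()

enc = tokenizer("An example prompt for tracing", return_tensors="pt")
# input_ids and attention_mask have identical sequence lengths by construction,
# so get_masks() can broadcast them without the 18-vs-32 size error.
example_input = {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]}

ov_model = ov.convert_model(model, example_input=example_input)
ov.save_model(ov_model, "chatglm3-6b-32k.xml")
```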
Using framework PyTorch: 2.2.2+cpu
Overriding 1 configuration item(s)
use_cache -> True
Traceback (most recent call last):
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/openvino/frontend/pytorch/ts_decoder.py", line 41, in init
pt_module = self._get_scripted_model(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/openvino/frontend/pytorch/ts_decoder.py", line 134, in _get_scripted_model
scripted = torch.jit.trace(
^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/jit/_trace.py", line 806, in trace
return trace_module(
^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/jit/_trace.py", line 1074, in trace_module
module._c._create_method_from_trace(
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 146, in wrapped
return module_call(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 366, in ts_patched_forward
outputs = patched_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/optimum/exporters/onnx/model_patcher.py", line 152, in patched_forward
outputs = self.orig_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wanglaiqi/.cache/huggingface/modules/transformers_modules/chatglm3-6b-32k/modeling_chatglm.py", line 940, in forward
transformer_outputs = self.transformer(
^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 146, in wrapped
return module_call(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/wanglaiqi/miniconda3/envs/openvino/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wanglaiqi/.cache/huggingface/modules/transformers_modules/chatglm3-6b-32k/modeling_chatglm.py", line 822, in forward
full_attention_mask = self.get_masks(input_ids, past_key_values, padding_mask=attention_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wanglaiqi/.cache/huggingface/modules/transformers_modules/chatglm3-6b-32k/modeling_chatglm.py", line 691, in get_masks
full_attention_mask = full_attention_mask * padding_mask.unsqueeze(1)
RuntimeError: The size of tensor a (18) must match the size of tensor b (32) at non-singleton dimension 2
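The failing line reduces to a tensor-shape problem: the causal mask built from input_ids (sequence length 18) cannot be broadcast against the padding mask (length 32). A standalone reproduction of just the failing multiply, with shapes taken from the error message and the logic paraphrased from get_masks (not the original code verbatim):

```python
import torch

batch, seq_len, mask_len = 1, 18, 32  # 18 traced tokens vs. a 32-token attention_mask

full_attention_mask = torch.ones(batch, seq_len, seq_len).tril_()  # (1, 18, 18) causal mask
padding_mask = torch.ones(batch, mask_len)                         # (1, 32) attention_mask

# Same broadcast as modeling_chatglm.py line 691:
# RuntimeError: The size of tensor a (18) must match the size of tensor b (32)
# at non-singleton dimension 2
full_attention_mask = full_attention_mask * padding_mask.unsqueeze(1)
```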
During handling of the above exception, another exception occurred:
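For reference, the high-level export path being exercised here is the optimum-intel one; a hedged sketch of it follows (the model id and output directory are assumptions, and whether this chatglm checkpoint converts cleanly on this optimum/OpenVINO combination is exactly what the log above is probing):

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "THUDM/chatglm3-6b-32k"  # assumed id; substitute the local checkpoint path

# export=True converts the PyTorch checkpoint to OpenVINO IR on load;
# trust_remote_code=True is needed because chatglm ships custom modeling code.
ov_model = OVModelForCausalLM.from_pretrained(model_id, export=True, trust_remote_code=True)

ov_model.save_pretrained("chatglm3-6b-32k-ov")
AutoTokenizer.from_pretrained(model_id, trust_remote_code=True).save_pretrained("chatglm3-6b-32k-ov")
```

The `optimum-cli export openvino` command drives the same conversion from the shell, so the same failure would be expected there until the mask mismatch is resolved.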