smartparrot opened 7 months ago
The above is solved; now there's a new error:
```
python llm_export.py --type Qwen-7B-Chat --path /mnt/LLM_Data/Qwen-7B-Chat --export_split --export_token --onnx_path /mnt/LLM_Data/Qwen-7B-Chat-onnx
```

Output:

```
/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|████████| 8/8 [00:06<00:00, 1.26it/s]
Traceback (most recent call last):
  File "/home/ubuntu/LLM/llm-export/llm_export.py", line 1264, in <module>
```
The problem above is gone; now it's reporting a segmentation fault.
This looks like an ONNX problem; try onnx version 1.12.0.
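For example, to pin that version in the same `modelscope` conda env:

```sh
pip install onnx==1.12.0
```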
Hello, how was this problem solved?
When I run:

```
python llm_export.py --type Qwen-7B-Chat --path /mnt/LLM_Data/Qwen-7B-Chat --export_split --export_token --export_mnn --onnx_path /mnt/LLM_Data/Qwen-7B-Chat-onnx --mnn_path /mnt/LLM_Data/Qwen-7B-Chat-mnn
```

I get:

```
Loading checkpoint shards: 100%|████████| 8/8 [00:06<00:00, 1.26it/s]
tiktoken tokenier
Traceback (most recent call last):
  File "/home/ubuntu/LLM/llm-export/llm_export.py", line 1264, in <module>
    llm_exporter.export_embed()
  File "/home/ubuntu/LLM/llm-export/llm_export.py", line 269, in export_embed
    torch.onnx.export(model, (input_ids),
  File "/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/onnx/utils.py", line 1654, in _export
    ) = graph._export_onnx(  # type: ignore[attr-defined]
RuntimeError: ONNX export failed. Could not open file or directory: .//mnt/LLM_Data/Qwen-7B-Chat-onnx/embed.weight
```
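For what it's worth, the `.//mnt/...` in the error hints at the likely cause: the exporter appears to prepend `./` to the `--onnx_path` value (an assumption based only on the error string, not verified in the code), and on POSIX systems `.//mnt/...` resolves relative to the current directory, so an absolute `--onnx_path` ends up pointing at a nonexistent `./mnt/...` subtree. A minimal sketch of that failure mode and a workaround:

```python
import os

# Hypothetical reconstruction of how the destination seems to be built,
# judging only from the ".//mnt/..." in the RuntimeError above.
onnx_path = "/mnt/LLM_Data/Qwen-7B-Chat-onnx"
dst = "./" + onnx_path + "/embed.weight"
print(dst)                   # .//mnt/LLM_Data/Qwen-7B-Chat-onnx/embed.weight
print(os.path.abspath(dst))  # resolves under the *current* directory, not /mnt

# Workaround: hand llm_export.py a relative output directory instead, e.g.
#   python llm_export.py ... --onnx_path Qwen-7B-Chat-onnx
os.makedirs("Qwen-7B-Chat-onnx", exist_ok=True)
```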