smartparrot opened 7 months ago
The above is solved; now there's a new error:
```
python llm_export.py --type Qwen-7B-Chat --path /mnt/LLM_Data/Qwen-7B-Chat --export_split --export_token --onnx_path /mnt/LLM_Data/Qwen-7B-Chat-onnx
```

Output:

```
/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|████████| 8/8 [00:06<00:00, 1.26it/s]
Traceback (most recent call last):
  File "/home/ubuntu/LLM/llm-export/llm_export.py", line 1264, in <module>
```
The problem above is gone; now it's reporting a segmentation fault.
This looks like an ONNX problem; try onnx version 1.12.0.
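For example, to pin that version in the same `modelscope` conda env:

```sh
pip install onnx==1.12.0
```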
Hello, how was this problem solved?
When I run:

```
python llm_export.py --type Qwen-7B-Chat --path /mnt/LLM_Data/Qwen-7B-Chat --export_split --export_token --export_mnn --onnx_path /mnt/LLM_Data/Qwen-7B-Chat-onnx --mnn_path /mnt/LLM_Data/Qwen-7B-Chat-mnn
```

I get:

```
Loading checkpoint shards: 100%|████████| 8/8 [00:06<00:00, 1.26it/s]
tiktoken tokenier
Traceback (most recent call last):
  File "/home/ubuntu/LLM/llm-export/llm_export.py", line 1264, in <module>
    llm_exporter.export_embed()
  File "/home/ubuntu/LLM/llm-export/llm_export.py", line 269, in export_embed
    torch.onnx.export(model, (input_ids),
  File "/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/home/ubuntu/anaconda3/envs/modelscope/lib/python3.10/site-packages/torch/onnx/utils.py", line 1654, in _export
    ) = graph._export_onnx(  # type: ignore[attr-defined]
RuntimeError: ONNX export failed. Could not open file or directory: .//mnt/LLM_Data/Qwen-7B-Chat-onnx/embed.weight
```
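For what it's worth, the `.//mnt/...` in the error hints at the likely cause: the exporter appears to prepend `./` to the `--onnx_path` value (an assumption based only on the error string, not verified in the code), and on POSIX systems `.//mnt/...` resolves relative to the current directory, so an absolute `--onnx_path` ends up pointing at a nonexistent `./mnt/...` subtree. A minimal sketch of that failure mode and a workaround:

```python
import os

# Hypothetical reconstruction of how the destination seems to be built,
# judging only from the ".//mnt/..." in the RuntimeError above.
onnx_path = "/mnt/LLM_Data/Qwen-7B-Chat-onnx"
dst = "./" + onnx_path + "/embed.weight"
print(dst)                   # .//mnt/LLM_Data/Qwen-7B-Chat-onnx/embed.weight
print(os.path.abspath(dst))  # resolves under the *current* directory, not /mnt

# Workaround: hand llm_export.py a relative output directory instead, e.g.
#   python llm_export.py ... --onnx_path Qwen-7B-Chat-onnx
os.makedirs("Qwen-7B-Chat-onnx", exist_ok=True)
```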