Platform (include target platform as well if cross-compiling): Ubuntu server with 128 cores and 132 GB of RAM

GitHub version: a980dba3963efb0ad76b0f3caaf5c21556f69ffe

Compiling method:

python3 llm_export.py \
        --type Qwen2-7B-Instruct \
        --path ~/models/qwen2-7b-instruct \
        --export \
        --export_token \
        --export_embed --embed_bin \
        --export_mnn

Build log:

CPU Group: [ 35  10  39  7  29  19  47  37  5  27  17  45  20  3  25  15  43  33  1  23  13  41  31  21  34  8  38  6  28  18  46  36  4  26  16  44  11  2  24  14  42  32  0  22  12  40  9  30 ], 1000000 - 3200000
The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 4/4 [01:03<00:00, 15.83s/it]
export start ...
/home/junfeng.chen/.cache/huggingface/modules/transformers_modules/qwen2-7b-instruct/modeling_qwen2.py:306: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  cos, sin = rotary_pos_emb
/home/junfeng.chen/.cache/huggingface/modules/transformers_modules/qwen2-7b-instruct/modeling_qwen2.py:323: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len):
/home/junfeng.chen/.cache/huggingface/modules/transformers_modules/qwen2-7b-instruct/modeling_qwen2.py:330: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, q_len, kv_seq_len):
/home/junfeng.chen/.cache/huggingface/modules/transformers_modules/qwen2-7b-instruct/modeling_qwen2.py:342: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz, self.num_heads, q_len, self.head_dim):
export done!
Killed
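
`Killed` is the message the shell prints when a process is terminated by SIGKILL. On Linux the kernel OOM killer is one common source of that signal, but nothing in the log above confirms it here. The following is a minimal sketch, assuming a Linux host, for recording memory figures around the export step so the report can include them. The helper names `parse_meminfo`, `peak_rss_gib`, and `memory_report` are hypothetical and are not part of llm_export.py or MNN.

```python
import resource


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def peak_rss_gib():
    """Peak resident set size of the current process in GiB.

    Assumes Linux, where ru_maxrss is reported in kilobytes.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / (1024 ** 2)


def memory_report(meminfo_path="/proc/meminfo"):
    """Collect peak RSS and system memory figures (all in GiB)."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    return {
        "peak_rss_gib": peak_rss_gib(),
        "mem_available_gib": info.get("MemAvailable", 0) / (1024 ** 2),
        "mem_total_gib": info.get("MemTotal", 0) / (1024 ** 2),
    }


if __name__ == "__main__":
    print(memory_report())
```

Calling `memory_report()` from inside the export process, for example right after the `export done!` message, would show how close the process is to the machine's limit before the later steps run. It cannot observe the kill itself, so the kernel log remains the place to confirm whether the OOM killer was involved.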

alibaba / MNN

Running llm_export.py on qwen2-7b gets Killed; 0.5b and 1.5b export normally #3013
