alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

MNN-LLM: multi-threaded results differ from single-threaded results #2931

Closed Zdahap closed 2 months ago

Zdahap commented 3 months ago

Platform (include the target platform as well if cross-compiling): Orin CPU (supports the sdot instruction)

Reproduction rate: fairly high

Github Version: MNN tag 2.9.1

Compiling method:

cmake -DMNN_SUPPORT_TRANSFORMER_FUSE=ON -DMNN_LOW_MEMORY=ON -DMNN_BUILD_LLM=ON ..

Problem:

Inference with qwen1.8b after exporting it with llm_export. Output with the default settings:

```
./llm_demo ../../../qwen_1.8b_weight_8bit_default_quant/llm.mnn 0 10
model path is ../../../qwen_1.8b_weight_8bit_default_quant/llm.mnn
### model name : Qwen_1.8b
The device support i8sdot:1, support fp16:1, support i8mm: 0
### precision, memory = 2, 2
Can't open file:.tempcache
Load Cache file error.
load tokenizer
load tokenizer Done
### disk embedding is 1
load ../../../qwen_1.8b_weight_8bit_default_quant/llm.mnn ... Done!
main, 195, cost time: 3614.658203 ms
Prepare for resize opt Begin
Prepare for resize opt End
Fix: 1071 - Total: 1071, rate = 1.000000
main, 199, cost time: 664.346008 ms

Q: whoareyou

A: I am an artificial intelligence language model. I do not have a physical existence or identity, but I exist to assist and provide information to those who interact with me. My to the石竹他们是,石他们失控
```

After changing config.numthread to 1 in Llm::load in llm.cpp, the chat output is:

```
./llm_demo ../../../qwen_1.8b_weight_8bit_default_quant/llm.mnn 0 10
model path is ../../../qwen_1.8b_weight_8bit_default_quant/llm.mnn
### model name : Qwen_1.8b
The device support i8sdot:1, support fp16:1, support i8mm: 0
### precision, memory = 2, 2
Can't open file:.tempcache
Load Cache file error.
load tokenizer
load tokenizer Done
### disk embedding is 1
load ../../../qwen_1.8b_weight_8bit_default_quant/llm.mnn ... Done!
main, 195, cost time: 3571.860107 ms
Prepare for resize opt Begin
Prepare for resize opt End
Fix: 1071 - Total: 1071, rate = 1.000000
main, 199, cost time: 1849.044067 ms

Q: whoareyou

A: I am an artificial intelligence language model. I do not have a physical existence or identity, but I exist to assist and provide information to those who interact with me. My purpose is to assist with tasks such as answering questions, providing information, and generating text based on the input I receive.
```
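
To pin down where the two runs stop agreeing, a small script can locate the first differing character between the two answers. The following is only a sketch: the helper name `first_divergence` is hypothetical and not part of MNN, and the two strings are copied from the logs in this report.

```python
def first_divergence(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if equal."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    # No mismatch in the overlapping part: either identical, or one is a prefix.
    return -1 if len(a) == len(b) else min(len(a), len(b))


# Answer from the default run, copied from the log in this report.
multi_thread = (
    "I am an artificial intelligence language model. I do not have a physical "
    "existence or identity, but I exist to assist and provide information to "
    "those who interact with me. My to the石竹他们是,石他们失控"
)

# Answer from the run with the thread count set to 1, copied from the log.
single_thread = (
    "I am an artificial intelligence language model. I do not have a physical "
    "existence or identity, but I exist to assist and provide information to "
    "those who interact with me. My purpose is to assist with tasks such as "
    "answering questions, providing information, and generating text based on "
    "the input I receive."
)

idx = first_divergence(multi_thread, single_thread)
print("first differing character index:", idx)
print("shared prefix ends with:", repr(multi_thread[:idx][-25:]))
print("default run continues with:", repr(multi_thread[idx:idx + 12]))
print("single-thread run continues with:", repr(single_thread[idx:idx + 12]))
```

Both answers share the same text up to and including "My ", so the default run only goes wrong partway through decoding rather than from the first token.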
jxt1234 commented 3 months ago

Update to master and re-export the model with the export tool under transformer/llm.

Zdahap commented 2 months ago

> Update to master and re-export the model with the export tool under transformer/llm.

Updated to MNN 2.9.2; the issue has not reproduced so far.