intel / xFasterTransformer

Apache License 2.0

[BUG] PR #224 causes wrong output for QWEN-14B #239

Closed a3213105 closed 6 months ago

a3213105 commented 6 months ago

After upgrading to the latest PR #224, QWEN-14B produces wrong output. The command:

python demo.py -t /mnt/data/LLM_Models/Qwen-14B-Chat/ -m /mnt/data/LLM_Models/Qwen-14B-Chat/cpu/ --do_sample False --rep_penalty 1.1 --output_len 512 --dtype bf16_fp16

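Output before this PR (the same as the Torch output): 在正常情况下,大车行走速度分为四挡,分别是10%、20%、50%和80%,具体取决于操作手柄所在挡位。
(Roughly, in English: "Under normal conditions, the crane's travel speed has four gears, 10%, 20%, 50% and 80%, depending on the gear the control handle is in.")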

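Output after this PR: 在二战期间,美国政府曾实行配给制度,定量供应食品和燃料油。故填埋场的垃圾处理厂,以及废物回收站。
(Roughly, in English: "During World War II, the US government implemented a rationing system, supplying food and fuel oil in fixed quantities. Hence the landfill's waste treatment plant, and the waste recycling station.")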
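
Since the command runs with --do_sample False, decoding should be deterministic, so the two outputs can be compared directly. The sketch below is only an illustration of that check: `first_divergence` is a hypothetical helper (not part of xFasterTransformer or demo.py), and the strings are the opening clauses of the two outputs quoted above. It compares characters rather than token IDs, which is an assumption made for simplicity.

```python
def first_divergence(reference: str, candidate: str) -> int:
    """Return the index of the first differing character, or -1 if identical.

    Hypothetical helper for illustration; it is not part of xFasterTransformer.
    """
    for i, (a, b) in enumerate(zip(reference, candidate)):
        if a != b:
            return i
    # One string is a strict prefix of the other: they diverge where the shorter ends.
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return -1


# Opening clauses of the outputs reported in this issue.
torch_out = "在正常情况下,大车行走速度分为四挡"  # before PR #224 (same as Torch)
xft_out = "在二战期间,美国政府曾实行配给制度"  # after PR #224

idx = first_divergence(torch_out, xft_out)
print(f"outputs diverge at character index {idx}")
```

Here the outputs already differ at the second character, which suggests the problem shows up from the very first generated tokens rather than drifting in later.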