NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

When converting the Qwen-110B GPTQ checkpoint, the shape of qkv_bias is not divisible by 3 #1589

Closed CallmeZhangChenchen closed 3 months ago

CallmeZhangChenchen commented 4 months ago

[screenshot of the error during checkpoint conversion]

jershi425 commented 4 months ago

@CallmeZhangChenchen Thank you for the feedback. Qwen-110B is not supported yet; we will support and validate it soon.

Tlntin commented 3 months ago

I got the same error with Qwen1.5-32B. I think it may be caused by GQA (grouped-query attention). I fixed it in this PR: pr link
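A minimal sketch of why GQA breaks a naive three-way split of the fused bias. The head counts below are illustrative assumptions, not the actual Qwen1.5-32B config: with grouped-query attention there are fewer key/value heads than query heads, so the fused `qkv_bias` length is `(num_heads + 2 * num_kv_heads) * head_dim`, which is generally not divisible by 3.

```python
# Illustrative GQA config (assumed numbers, not the real Qwen1.5-32B values).
num_heads = 40       # query heads
num_kv_heads = 8     # key/value heads: GQA uses fewer than the query heads
head_dim = 128

# Length of the fused bias laid out as [q_bias | k_bias | v_bias].
total = (num_heads + 2 * num_kv_heads) * head_dim
print(total, total % 3)  # 7168 1 -> a naive split into 3 equal chunks fails

# GQA-aware split: use per-section sizes instead of three equal chunks.
q_size = num_heads * head_dim
kv_size = num_kv_heads * head_dim
assert q_size + 2 * kv_size == total
```

With multi-head attention (`num_kv_heads == num_heads`) the three sections are equal and the old divide-by-3 logic happens to work, which is why the bug only surfaces on GQA models.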

nv-guomingz commented 3 months ago

This week's update will contain the fix submitted by @Tlntin.

Hukongtao commented 2 months ago

The model can be converted and built into the engine normally, but the inference results are garbled. Have you ever encountered this? [screenshot]

Tlntin commented 2 months ago


I think you need to update auto-gptq and transformers to the latest versions.
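A small hypothetical helper for checking what is actually installed before re-running the conversion; the package list is an assumption based on the GPTQ loading path mentioned in this thread, not an official TensorRT-LLM requirement set.

```python
from importlib import metadata

def report_versions(packages=("transformers", "auto-gptq", "optimum")):
    """Return the installed version of each package, or 'not installed'.

    Hypothetical helper: lets you confirm the GPTQ-related packages are
    current before rebuilding the checkpoint and engine.
    """
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "not installed"
    return versions

print(report_versions())
```

If a package is stale, `pip install -U <package>` it, then redo the conversion and engine build rather than reusing the old artifacts.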

Hukongtao commented 2 months ago


I still have the problem after updating. [screenshot]

Tlntin commented 2 months ago


After updating, did you rebuild the engine?

Hukongtao commented 2 months ago


Yes, I regenerated both rank0.safetensors and rank0.engine. [screenshot]

Hukongtao commented 2 months ago

v0.10.0 may have some problems. I used the latest code from the main branch and the outputs were aligned.

tianzuishiwo commented 2 weeks ago


I'm hitting the same problem. Did you solve it, brother?

tianzuishiwo commented 2 weeks ago


I'm hitting the same problem. Did you solve it, brother?