System Info
CPU architecture: x86_64
GPU name: NVIDIA V100 32GB
Who can help?
No response
Information
[X] The official example scripts
[ ] My own modified scripts
Tasks
[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)
Reproduction
branch: main
commit: b57221b764bc579cbb2490154916a871f620e2c4
1. Build TensorRT-LLM on the main branch.
2. pip install ./build/tensorrt_llm-0.8.0.dev20240123-cp310-cp310-linux_x86_64.whl
3. Run the QwenVL example:
3.1 Download Qwen-VL
3.2 ViT
3.3 Qwen: quantize the weights to INT4 with GPTQ
Expected behavior
I expected gptq_convert.py to convert successfully, but it does not.
actual behavior
The result is as follows:
root@bbc1235:~/TensorRT-LLM/examples/qwenvl# python3 gptq_convert.py --hf_model_dir ./Qwen-VL-Chat --tokenizer_dir ./Qwen-VL-Chat --quant_ckpt_path ./Qwen-VL-Chat-4bit
CUDA extension not installed.
CUDA extension not installed.
[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev20240123, commit: b57221b764bc579cbb2490154916a871f620e2c4
Traceback (most recent call last):
File "/root/TensorRT-LLM/examples/qwenvl/gptq_convert.py", line 13, in <module>
from utils.utils import make_context
ModuleNotFoundError: No module named 'utils.utils'
additional notes
It seems there is no module called utils.utils. I then ran pip install utils, but that does not work either, because the pip utils package has no function called make_context.
The result is as follows:
root@bbc1235:~/TensorRT-LLM/examples/qwenvl# python3 gptq_convert.py --hf_model_dir ./Qwen-VL-Chat --tokenizer_dir ./Qwen-VL-Chat --quant_ckpt_path ./Qwen-VL-Chat-4bit
CUDA extension not installed.
CUDA extension not installed.
[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev20240123, commit: b57221b764bc579cbb2490154916a871f620e2c4
Traceback (most recent call last):
File "/root/TensorRT-LLM/examples/qwenvl/gptq_convert.py", line 13, in <module>
from utils import make_context
ImportError: cannot import name 'make_context' from 'utils' (/usr/local/lib/python3.10/dist-packages/utils/__init__.py)
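For diagnosing this kind of shadowing, a quick check of which on-disk file the name utils resolves to can help (a generic sketch, not specific to this repo):

```python
import importlib.util

# Ask the import machinery which file the name "utils" would resolve to.
# A path under site-packages means the pip "utils" package is winning over
# any local utils/ directory shipped with the example.
spec = importlib.util.find_spec("utils")
if spec is None:
    print("no importable module named 'utils' was found")
else:
    print(f"'utils' resolves to: {spec.origin}")
```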
Do you have a local module called utils that shares its name with the pip package, and perhaps forgot to upload it?
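As a possible stopgap until the missing file is published, one could create a local utils package whose utils.py re-exports make_context from the downloaded checkpoint. This is only a sketch: it assumes the Qwen-VL-Chat checkpoint directory ships a qwen_generation_utils.py defining make_context, which is a guess and not confirmed by the repo.

```python
import os

# Hypothetical stopgap: create the utils/utils.py layout that
# gptq_convert.py expects, re-exporting make_context from the
# downloaded checkpoint directory (path and module name assumed).
os.makedirs("utils", exist_ok=True)

# Plain package marker so "utils.utils" is importable.
with open(os.path.join("utils", "__init__.py"), "w") as f:
    f.write("")

shim = (
    "import sys\n"
    "sys.path.insert(0, './Qwen-VL-Chat')  # checkpoint dir, assumed to hold qwen_generation_utils.py\n"
    "from qwen_generation_utils import make_context  # re-export for the example\n"
)
with open(os.path.join("utils", "utils.py"), "w") as f:
    f.write(shim)
```

Because the script's own directory sits first on sys.path, a local utils package created this way should take precedence over the pip-installed one.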