NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

When will Qwen1.5 be supported? #1208

Open mogoxx opened 4 months ago

mogoxx commented 4 months ago

python build.py --hf_model_dir /app/model/Qwen1.5-14B-Chat \
    --dtype float16 \
    --remove_input_padding \
    --use_gemm_plugin float16 \
    --use_gpt_attention_plugin float16 \
    --use_inflight_batching \
    --max_batch_size 2 \
    --max_input_len 2048 \
    --max_output_len 2048 \
    --output_dir /app/model/trt_engines/fp16/1-gpu

Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/qwen/build.py", line 609, in <module>
    args = parse_arguments()
  File "/app/tensorrt_llm/examples/qwen/build.py", line 356, in parse_arguments
    hf_config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 734, in __getitem__
    raise KeyError(key)
KeyError: 'qwen2'

whk6688 commented 4 months ago

+1

whk6688 commented 4 months ago

AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'

ArlanCooper commented 4 months ago

+1, waiting for this.

litaotju commented 3 months ago

@mogoxx I think the crashing stack

  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 734, in __getitem__
    raise KeyError(key)

is from the transformers library. Could you check whether the latest transformers release supports Qwen2?
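A quick way to run that check is to compare the installed transformers version against the release that first registered the `qwen2` model type. The sketch below assumes the minimum is 4.37.0, as stated on the Qwen1.5 model cards; the helper names `parse_release` and `supports_qwen2` are made up for this illustration and are not part of transformers or TensorRT-LLM.

```python
import re
from importlib import metadata

# Assumption: the Qwen1.5 model cards ask for transformers>=4.37.0;
# older releases raise KeyError: 'qwen2' in AutoConfig.from_pretrained.
QWEN2_MIN_TRANSFORMERS = (4, 37, 0)


def parse_release(version: str) -> tuple:
    """Return the leading numeric part of a version string as a 3-tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. "4.39") is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def supports_qwen2(version: str) -> bool:
    """True if this transformers version is new enough to know 'qwen2'."""
    return parse_release(version) >= QWEN2_MIN_TRANSFORMERS


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        print(f"transformers {installed}: qwen2 known = {supports_qwen2(installed)}")
```

If this reports that the installed version is too old, upgrading transformers should get past the `KeyError: 'qwen2'`, although the conversion script itself may still need Qwen2 support.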

pfk-beta commented 3 months ago

@litaotju The stack trace ends in transformers but begins in the convert script, so the problem might be in both places. In my case, transformers==4.39.1 and TensorRT-LLM version 0.9.0.dev2024031900 give me the same error:

  File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1421, in <module>
    main()
  File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1216, in main
    args.rms_norm_eps = hf_config.layer_norm_epsilon
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 263, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'
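The traceback suggests the script reads `layer_norm_epsilon`, the name used by the original Qwen config, while `Qwen2Config` exposes the value as `rms_norm_eps`. A minimal sketch of a tolerant lookup follows; `get_norm_eps` is a hypothetical helper, not the actual TensorRT-LLM fix, and the `SimpleNamespace` objects are stand-ins for real Hugging Face configs.

```python
from types import SimpleNamespace

# Attribute names tried in order. Assumption: Qwen2-style configs use
# "rms_norm_eps" and the original Qwen config uses "layer_norm_epsilon".
EPS_ATTRIBUTE_CANDIDATES = ("rms_norm_eps", "layer_norm_epsilon")


def get_norm_eps(hf_config, default=None):
    """Return the normalization epsilon from whichever attribute exists."""
    for name in EPS_ATTRIBUTE_CANDIDATES:
        value = getattr(hf_config, name, None)
        if value is not None:
            return value
    if default is not None:
        return default
    raise AttributeError(
        f"config has none of the attributes {EPS_ATTRIBUTE_CANDIDATES}"
    )


# Stand-in configs for illustration only.
qwen_v1_like = SimpleNamespace(layer_norm_epsilon=1e-6)
qwen2_like = SimpleNamespace(rms_norm_eps=1e-6)

print(get_norm_eps(qwen_v1_like))
print(get_norm_eps(qwen2_like))
```

In a conversion script, the failing assignment could then become something like `args.rms_norm_eps = get_norm_eps(hf_config)`. Other Qwen2 differences in the config or weights would still have to be handled separately.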

bluejadezhou commented 3 months ago

+1

shiqingzhangCSU commented 3 months ago

https://github.com/Tlntin/Qwen-TensorRT-LLM seems to implement Qwen2.

Franc-Z commented 3 months ago

For TRTLLM-0.9.0, you can refer to https://github.com/Franc-Z/QWen1.5_TensorRT-LLM

bao21987 commented 3 months ago

@Franc-Z your repo is not publicly accessible.

pfk-beta commented 3 months ago

@bao21987 repo was public yesterday...

Franc-Z commented 3 months ago

@bao21987 Confirmed, you can access it now.

ArlanCooper commented 3 months ago

@bao21987 Confirm, and you can access.

Can you give me access to it? Thanks.

deutschthomas commented 3 months ago

For TRTLLM-0.9.0, you can refer to https://github.com/Franc-Z/QWen1.5_TensorRT-LLM

I also need access.

Franc-Z commented 3 months ago

It's open again now @pfk-beta @ArlanCooper @deutschthomas

Liu-Da commented 1 month ago

@litaotju The stack trace ends in transformers but begins in the convert script, so the problem might be in both places. In my case, transformers==4.39.1 and TensorRT-LLM version 0.9.0.dev2024031900 give me the same error:

  File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1421, in <module>
    main()
  File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1216, in main
    args.rms_norm_eps = hf_config.layer_norm_epsilon
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 263, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'

+1

riverind commented 1 month ago

https://github.com/Tlntin/Qwen-TensorRT-LLM seems to implement Qwen2.

[Screenshot 2024-06-13 11 21 48]

According to the documentation, Qwen1.5 actually seems to be supported; see https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/qwen/README.md