NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

AttributeError: 'PluginConfig' object has no attribute '_remove_input_padding'. Did you mean: '_remove_input_padding'? #2045

Open xinliu9451 opened 1 month ago

xinliu9451 commented 1 month ago

System Info

GPU: A10
tensorrt-cu12==10.2.0.post1
tensorrt-cu12-bindings==10.2.0.post1
tensorrt-cu12-libs==10.2.0.post1
tensorrt_llm==0.12.0.dev2024072300
python==3.10

Who can help?

@Tracin

Information

Tasks

Reproduction

(distil-asr) root@ip-172-31-17-245:~/TensorRT-LLM/examples/whisper# trtllm-build \
  --checkpoint_dir distil_whisper_large_v3_weights_int8/encoder \
  --output_dir distil_whisper_large_v3_int8/encoder \
  --paged_kv_cache disable \
  --moe_plugin disable \
  --enable_xqa disable \
  --max_batch_size 8 \
  --gemm_plugin disable \
  --bert_attention_plugin float16 \
  --remove_input_padding disable \
  --max_input_len 1500

Expected behavior

I'm using distil-large-v3, and trtllm-build fails with the error below.

Actual behavior

[TensorRT-LLM] TensorRT-LLM version: 0.12.0.dev2024072300
[07/29/2024-15:10:41] [TRT-LLM] [W] Implicitly setting PretrainedConfig.n_mels = 128
[07/29/2024-15:10:41] [TRT-LLM] [W] Implicitly setting PretrainedConfig.n_audio_ctx = 1500
[07/29/2024-15:10:41] [TRT-LLM] [W] Implicitly setting PretrainedConfig.num_languages = 100
[07/29/2024-15:10:41] [TRT-LLM] [I] max_seq_len is not specified, using value 2048
Traceback (most recent call last):
  File "/opt/conda/envs/distil-asr/bin/trtllm-build", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/distil-asr/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 488, in main
    build_config = BuildConfig.from_dict(
  File "/opt/conda/envs/distil-asr/lib/python3.10/site-packages/tensorrt_llm/builder.py", line 562, in from_dict
    return cls(
  File "<string>", line 33, in __init__
  File "/opt/conda/envs/distil-asr/lib/python3.10/site-packages/tensorrt_llm/builder.py", line 503, in __post_init__
    remove_input_padding=self.plugin_config.remove_input_padding,
  File "/opt/conda/envs/distil-asr/lib/python3.10/site-packages/tensorrt_llm/plugin/plugin.py", line 79, in prop
    field_value = getattr(self, storage_name)
AttributeError: 'PluginConfig' object has no attribute '_remove_input_padding'. Did you mean: '_remove_input_padding'?
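For context on the error itself: the failing line in plugin.py is a generated property that reads a private backing field via getattr. A minimal sketch (hypothetical names, not the actual TensorRT-LLM code) of how such a pattern raises this AttributeError when the backing field was never initialized, e.g. after a version mismatch between the installed wheel and the build script:

```python
# Sketch of the property pattern visible in the traceback: each public
# config field is a property that reads a private "_name" attribute.
def make_prop(name):
    storage_name = "_" + name

    def prop(self):
        # Raises AttributeError if __init__ never set the private field.
        return getattr(self, storage_name)

    return property(prop)


class PluginConfigSketch:
    remove_input_padding = make_prop("remove_input_padding")

    def __init__(self, init_fields=True):
        if init_fields:
            self._remove_input_padding = True  # normal initialization path


ok = PluginConfigSketch()
assert ok.remove_input_padding is True

broken = PluginConfigSketch(init_fields=False)
try:
    broken.remove_input_padding
except AttributeError as e:
    print(e)  # mentions the missing '_remove_input_padding' attribute
```

This is consistent with the "versioning issue" theory below: if the config object was built by code that predates (or postdates) the property definitions, the private fields the properties expect are simply never set.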

additional notes

I suspect this is a version problem.

tjongsma commented 1 month ago

I've got exactly the same problem after running the whisper example with large-v3 on Windows 10, tensorrt-llm 11.0, in a fresh virtualenv.

Kefeng-Duan commented 3 weeks ago

why do we need to set 'remove_input_padding disable' ?

tjongsma commented 3 weeks ago

why do we need to set 'remove_input_padding disable' ?

I'm guessing we don't; I'm just following the example. I did try, for instance, manually setting input_padding and removing it from the PluginConfig, but then you get the same error for every other parameter in PluginConfig. It's a game of whack-a-mole: every time you fix one, the next one breaks. I haven't found an overall fix, and I suspect that even if I do, something else will break, because this does look like a versioning issue.

yuekaizhang commented 3 weeks ago

@xinliu9451 @tjongsma @Kefeng-Duan, for distil-whisper, would you mind adding model = model.half() here https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/whisper/distil_whisper/convert_from_distil_whisper.py#L60 for now?
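The suggested workaround casts the model's weights to fp16 before conversion. A minimal illustration of what model = model.half() does, using a toy torch module rather than the actual Whisper model or the conversion script:

```python
import torch

# Toy stand-in for the loaded distil-whisper model; .half() casts every
# parameter from fp32 to fp16 in place and returns the module, which is
# the dtype the downstream checkpoint conversion expects.
model = torch.nn.Linear(4, 4)
assert model.weight.dtype == torch.float32

model = model.half()  # the one-line change suggested above
assert model.weight.dtype == torch.float16
```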

The code fix will be synced to github later.

yuekaizhang commented 3 weeks ago

why do we need to set 'remove_input_padding disable' ?

Sorry, the remove_input_padding option for distil-whisper will be supported in the future.

yuekaizhang commented 2 weeks ago

See https://github.com/NVIDIA/TensorRT-LLM/issues/2118#issuecomment-2292413603 also.