NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

trtllm-build ignores `--model_cls_file` and `--model_cls_name` #2430

Open abhishekudupa opened 1 week ago

abhishekudupa commented 1 week ago

The `trtllm-build` entrypoint ignores the user-provided `--model_cls_file` and `--model_cls_name`. It looks like although the model class is resolved and imported, an appropriate entry in `MODEL_MAP` isn't made, as seen here.

I've verified that adding the line `MODEL_MAP[args.model_cls_name] = model_cls` immediately after the line referenced above fixes this issue.
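A minimal sketch of the proposed behavior, assuming the import path and the shape of `MODEL_MAP` (the helper name and `"user_model"` module name here are illustrative, not the actual trtllm-build internals):

```python
import importlib.util

# Illustrative stand-in for trtllm-build's architecture dispatch table,
# which maps a model class name to its model class.
MODEL_MAP = {}


def load_and_register_model_cls(model_cls_file, model_cls_name):
    """Import the user's model class from a file, then register it.

    The import/getattr part mirrors how --model_cls_file / --model_cls_name
    are resolved; the MODEL_MAP assignment is the missing step this issue
    proposes adding.
    """
    spec = importlib.util.spec_from_file_location("user_model", model_cls_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    model_cls = getattr(module, model_cls_name)
    # Proposed fix: without this registration, later MODEL_MAP lookups
    # for the custom architecture raise KeyError.
    MODEL_MAP[model_cls_name] = model_cls
    return model_cls
```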

Who can help?

@ncomly-nvidia , @byshiue

Information

Tasks

Reproduction

Run a `trtllm-build` command for any model architecture that is not supported by default, supplying `--model_cls_file` and `--model_cls_name`; the build ends with a `KeyError`.
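The failure mode can be sketched as follows; `MODEL_MAP`, its contents, and `resolve_model_cls` are illustrative stand-ins for the trtllm-build dispatch logic, not its actual API:

```python
# Illustrative stand-in: only built-in architectures are registered.
MODEL_MAP = {"LlamaForCausalLM": object}


def resolve_model_cls(architecture):
    # A custom class imported via --model_cls_file is never added to
    # MODEL_MAP, so looking up an unsupported architecture fails.
    return MODEL_MAP[architecture]


try:
    resolve_model_cls("MyCustomModel")  # imported but never registered
except KeyError as err:
    print(f"KeyError: {err}")
```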

Expected behavior

`trtllm-build` picks up the user-provided model class, runs to completion, and produces the engine files.

Actual behavior

`trtllm-build` raises a `KeyError`.

Additional notes

None; the fix is proposed in the problem description.