Hi there,
I am trying to build a custom configuration of BERT for performance measurement only. Following the README for GPT, I am using `generate_checkpoint_config.py` to generate the BERT config with the following command:
Then, I run the trtllm-build command as follows:
It throws an error:
RuntimeError: Unsupported model architecture: BertModel
Looking at `__init__.py`, the `MODEL_MAP` does not include BERT. Is it not officially supported? Do you have any suggestions on how to proceed?
Thanks!
Below are the package versions I am using, for reference.