TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
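As a quick illustration of that Python API, here is a minimal sketch assuming the high-level `LLM` entry point that recent releases expose; the import path and argument names have shifted between versions, so treat it as illustrative rather than authoritative:

```python
# Minimal sketch, assuming the high-level LLM API in recent
# TensorRT-LLM releases; import paths and argument names have
# moved between versions.
from tensorrt_llm import LLM, SamplingParams

# Build (or load a cached) TensorRT engine for a Hugging Face model,
# then run generation on it.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
sampling = SamplingParams(temperature=0.8, max_tokens=32)

for output in llm.generate(["Hello, my name is"], sampling):
    print(output.outputs[0].text)
```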
Greetings, I'd like to ask where the up-to-date documentation for TensorRT-LLM is available.
@juney-nvidia
The links on the official NVIDIA website that are meant to point to the Python API reference and the plugins reference lead to empty pages in the documentation:
https://nvidia.github.io/TensorRT-LLM/python-api/tensorrt_llm.plugin.html
https://nvidia.github.io/TensorRT-LLM/python-api/tensorrt_llm.functional.html
I'd like to learn which plugins are available, what they do, how to use them properly, and which parameters and arguments can be passed to the TensorRT-LLM build command. None of this seems to be clearly documented anywhere, unless I am missing something.
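For context, the closest I have gotten by reading the source is something like the sketch below. The module path comes from the broken `tensorrt_llm.plugin` link above, and every field name in it is a guess on my part rather than something confirmed by documentation:

```python
# Names guessed from the tensorrt_llm.plugin module that the empty
# docs page is supposed to cover; I cannot verify them against any
# reference, which is exactly the gap described above.
from tensorrt_llm.plugin import PluginConfig

plugin_config = PluginConfig()
plugin_config.gpt_attention_plugin = "float16"  # fused attention kernels
plugin_config.gemm_plugin = "float16"           # plugin-based GEMMs

# Printing the object at least reveals which plugin knobs exist
# in the installed version.
print(plugin_config)
```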