NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

nccl ops from TRT-LLM #2220

Open apbose opened 1 week ago

apbose commented 1 week ago

Hi, I have a use case in which I would like to use the NCCL ops plugin from TRT-LLM in my project. I see that there is a code snippet in tensorrt_llm/plugin/plugin.py which loads the "libnvinfer_plugin_tensorrt_llm.so" file, and this shared library also gets created in the tensorrt_llm/lib folder when I run python scripts/build_wheel.py. I was wondering: if I do import tensorrt_llm, where should I get access to the shared library? Is it shipped with the releases, so that I can directly call _load_plugin_lib() and use it in my Python code?
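
For reference, a minimal sketch of locating the shared library relative to the installed package, rather than hard-coding a site-packages path. The "libs" subdirectory name is taken from the path quoted in the follow-up comment below; this is an assumption about the wheel layout, not an official API:

import ctypes
import pathlib

import tensorrt_llm

# Assumed location: the "libs" subdirectory of the installed package
# (matches the site-packages path quoted in the next comment).
lib_path = pathlib.Path(tensorrt_llm.__file__).parent / "libs" / "libnvinfer_plugin_tensorrt_llm.so"
handle = ctypes.CDLL(str(lib_path), mode=ctypes.RTLD_GLOBAL)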

apbose commented 6 days ago

When I do something like

import ctypes

import tensorrt as trt

try:
    ctypes.CDLL("/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so")
    print("plugin loaded successfully")
except OSError as e:
    print(f"unsuccessful load: {e}")

logger = trt.Logger(trt.Logger.VERBOSE)
trt.init_libnvinfer_plugins(logger, '')
plugin_registry = trt.get_plugin_registry()
for plugin_creator in plugin_registry.plugin_creator_list:
    print(f"Plugin Name: {plugin_creator.name}, Namespace: {plugin_creator.plugin_namespace}, Version: {plugin_creator.plugin_version}")

It does not print the TRT-LLM plugins. But when I run nm -D /root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so | grep PluginCreator, it shows me the plugin symbols. Why are they not getting loaded, then?
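
A possible explanation: the creator symbols being present in the shared object is not the same as the creators being registered. The library only registers its creators with TensorRT's plugin registry when its initialization entry point is called, which is what _load_plugin_lib() in tensorrt_llm/plugin/plugin.py appears to do after loading the library with RTLD_GLOBAL. Below is a minimal sketch, assuming an initTrtLlmPlugins entry point and a "tensorrt_llm" plugin namespace as used in plugin.py; verify both against your installed version:

import ctypes

import tensorrt as trt

LIB = "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so"

# RTLD_GLOBAL so the plugin creators can resolve TensorRT symbols.
handle = ctypes.CDLL(LIB, mode=ctypes.RTLD_GLOBAL)

# Assumed entry point exported by the plugin library (see plugin.py);
# calling it registers the TRT-LLM creators under the given namespace.
handle.initTrtLlmPlugins.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
handle.initTrtLlmPlugins.restype = ctypes.c_bool
assert handle.initTrtLlmPlugins(None, "tensorrt_llm".encode("utf-8"))

# The creators should now appear, under the "tensorrt_llm" namespace
# rather than the default "" namespace.
registry = trt.get_plugin_registry()
for creator in registry.plugin_creator_list:
    print(creator.name, creator.plugin_namespace, creator.plugin_version)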