### System Info

### Who can help?

@byshiue

### Information

### Tasks

- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

### Reproduction
Building steps:

My CMakeLists.txt for testing the library:

Code that compiled successfully:

Code that failed:
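The original CMakeLists.txt did not survive above, so as a point of comparison, here is a minimal sketch of a Windows CMakeLists.txt that links against the prebuilt TensorRT-LLM libraries. All paths (`TRTLLM_DIR`, `TENSORRT_DIR`, the `include`/`lib` layout) are assumptions and must be adapted to the local installation; a missing `tensorrt_llm.lib` in `target_link_libraries` is one common cause of the linker errors described below.

```cmake
cmake_minimum_required(VERSION 3.18)
project(trtllm_executor_test LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Assumed install locations -- adjust to your machine.
set(TRTLLM_DIR "C:/TensorRT-LLM" CACHE PATH "TensorRT-LLM install dir")
set(TENSORRT_DIR "C:/TensorRT" CACHE PATH "TensorRT install dir")

add_executable(main main.cpp)

target_include_directories(main PRIVATE
    "${TRTLLM_DIR}/include"
    "${TENSORRT_DIR}/include")

# On Windows, the import library (.lib) next to tensorrt_llm.dll is what
# resolves the executor symbols at link time.
target_link_libraries(main PRIVATE
    "${TRTLLM_DIR}/lib/tensorrt_llm.lib"
    "${TENSORRT_DIR}/lib/nvinfer.lib")
```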
### Expected behavior

The compilation of main.cpp should finish without any errors when using classes and structures from `tensorrt_llm/executor/executor.h`.

### actual behavior
I received `unresolved external symbol` linker errors whenever I used anything from `tensorrt_llm/executor/executor.h`, but code using any other header built and ran without problems.

### additional notes
Is `tensorrt_llm::executor` with the C++ runtime unsupported on Windows in version 0.10.0? Do newer versions support it?
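For reference, the kind of executor usage in question looks roughly like the following. This is a sketch based on my reading of the 0.10 `executor.h` API, not a verified program: the constructor signatures, the `Request` parameters, and the `enqueueRequest`/`awaitResponses` names should all be checked against the shipped header, and the engine path and token IDs are placeholders.

```cpp
#include <filesystem>
#include "tensorrt_llm/executor/executor.h"

namespace tle = tensorrt_llm::executor;

int main()
{
    // Any use of these classes produced the unresolved-symbol errors on Windows.
    tle::ExecutorConfig config;  // default configuration
    tle::Executor executor(std::filesystem::path("engine_dir"),  // placeholder path
                           tle::ModelType::kDECODER_ONLY, config);

    // Placeholder token IDs; maxNewTokens = 10.
    tle::Request request({1, 2, 3, 4}, 10);
    auto requestId = executor.enqueueRequest(request);
    auto responses = executor.awaitResponses(requestId);
    return 0;
}
```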