janhq/cortex.tensorrt-llm
Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a submodule for GPU-accelerated inference on NVIDIA GPUs.
https://cortex.jan.ai/docs/cortex-tensorrt-llm
Apache License 2.0
Makefile and CICD for cpp tensorrt-llm
#39
Closed
hiento09 closed this 5 months ago
hiento09 commented 5 months ago
Makefile and CICD for cpp tensorrt-llm
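The issue body does not include the Makefile itself, only its title. As a rough illustration of what a Makefile driving a C++ TensorRT-LLM build might look like, here is a minimal sketch; the target names, the `build` directory, and the use of CMake underneath are all assumptions, not taken from the repository.

```makefile
# Hypothetical sketch only: the issue does not show the actual Makefile.
# Assumes a CMake-based C++ project; all names below are illustrative.
BUILD_DIR ?= build
CMAKE_EXTRA_FLAGS ?=

.PHONY: all configure build clean

all: build

# Generate build files (Release config by default).
configure:
	cmake -S . -B $(BUILD_DIR) -DCMAKE_BUILD_TYPE=Release $(CMAKE_EXTRA_FLAGS)

# Compile the library with all available cores.
build: configure
	cmake --build $(BUILD_DIR) --config Release -j

# Remove generated build artifacts.
clean:
	rm -rf $(BUILD_DIR)
```

A thin Makefile wrapper like this is a common pattern for CI/CD pipelines, since the CI job can simply invoke `make` while the CMake details stay in one place.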