triton-inference-server / triton_cli

Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.
Fix: Upgrading Triton-CLI support for TensorRT-LLM v0.10.0 #75

Closed. KrishnanPrash closed this 3 months ago

KrishnanPrash commented 3 months ago

This PR upgrades the convert_checkpoint.py scripts for gpt2, llama, and opt to support tensorrt_llm v0.10.0.

Source for code changes: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.10.0/examples