Fix: Upgrading Triton-CLI support for TensorRT-LLM v0.10.0

triton-inference-server / triton_cli

Triton CLI is an open-source command-line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.

Closed: KrishnanPrash closed this 3 months ago

KrishnanPrash commented 3 months ago

Upgrades the convert_checkpoint.py scripts for gpt2, llama, and opt to support tensorrt_llm v0.10.0.