replicate / cog-triton

A cog implementation of Nvidia's Triton server
Apache License 2.0

Joe/build triton main #13

Closed by joehoover 5 months ago

joehoover commented 5 months ago

This branch was used to develop and refactor cog-triton for TensorRT-LLM 0.8.0. It also introduces a half-baked benchmark script.