replicate / cog-triton

A cog implementation of Nvidia's Triton server
Apache License 2.0

Backport some changes from trtllm-0.9 branch #35

Closed by yorickvP 6 months ago

yorickvP commented 7 months ago

#33 contains some changes that make life better, along with some further size optimizations.