replicate / cog-triton

A cog implementation of Nvidia's Triton server
Apache License 2.0

Backport some changes from trtllm-0.9 branch #35

Closed yorickvP closed 3 months ago

yorickvP commented 3 months ago

#33 contains some changes that make life better and has some further size optimizations.