basetenlabs / truss

The simplest way to serve AI/ML models in production
https://truss.baseten.co
MIT License
857 stars, 61 forks

improves trtllm example #998

Closed rcano-baseten closed 3 weeks ago

rcano-baseten commented 3 weeks ago

:rocket: What

Addresses some shortcomings in the trtllm example.

:microscope: Testing

Deployed this on a Llama-3-8b-trt-llm model and ran it through its paces: tested both stream=True and stream=False, passed additional params to ensure forward compatibility, and printed the model input from truss to verify functionality.
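A minimal sketch of the kind of request bodies such a test pass might use, one per streaming mode, with extra parameters passed through unchanged. The helper name `build_payloads` and the field names `prompt`, `max_tokens`, and `some_future_param` are assumptions for illustration only; they are not taken from this PR or from the truss API.

```python
import json


def build_payloads(prompt, **extra_params):
    """Build one request body per streaming mode (hypothetical helper).

    Field names here are assumed for illustration, not taken from truss.
    """
    payloads = []
    for stream in (True, False):
        body = {"prompt": prompt, "stream": stream}
        # Extra keys are copied through untouched, mimicking a
        # forward-compatibility check with parameters the example
        # may not know about yet.
        body.update(extra_params)
        payloads.append(body)
    return payloads


if __name__ == "__main__":
    # Print the bodies that would be sent to the deployed model.
    for body in build_payloads("Hello", max_tokens=32, some_future_param="x"):
        print(json.dumps(body))
```

Each body would then be sent to the deployed model, and the input logged on the model side compared against what was sent.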

Some Follow Ups:

rcano-baseten commented 3 weeks ago

closing in favor of https://github.com/basetenlabs/truss/pull/1010