aws-samples / foundation-model-benchmarking-tool

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.
https://aws-samples.github.io/foundation-model-benchmarking-tool/
MIT No Attribution
202 stars, 31 forks

Integrate Triton Inference Server with DJL #204

Closed madhurprash closed 1 month ago

madhurprash commented 1 month ago

This PR refactors the code for running Triton on AWS chips with vLLM and DJL. It has been tested with Triton (on both vLLM and DJL) and with a previous DJL file.

To do:

  1. Test Triton with TensorRT