I'm looking for advice. Based on your experience, which engine provides better-optimized runtime inference: vLLM, TensorRT-LLM, or any other engine you have encountered for running on NVIDIA GPUs?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.