vectorch-ai / ScaleLLM

A high-performance inference system for large language models, designed for production environments.
https://docs.vectorch.com/
Apache License 2.0
377 stars, 28 forks

Fix CUDA graph capture issue on multiple devices #248

Closed. guocuimi closed this 3 months ago

guocuimi commented 3 months ago

Fixes issue https://github.com/vectorch-ai/ScaleLLM/issues/131