flexflow / FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving
https://flexflow.readthedocs.io
Apache License 2.0

Cuda graph Draft #1298

Closed lambda7xx closed 1 month ago

lambda7xx commented 5 months ago

Description of changes:

Related Issues:

Linked Issues:

Issues closed by this PR:
