The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
docs: Add explanations for model loading acceleration #5066
Closed
Sherlock113 closed 2 weeks ago
What does this PR address?
Fixes #(issue)
Before submitting:
pre-commit run -a script has passed (instructions)?