arcee-ai/fastmlx
FastMLX is a high-performance, production-ready API to host MLX models.
Add support for token streaming, parallel jobs and custom CORS
#4
Closed
Blaizzy closed 4 months ago
Blaizzy commented 4 months ago
This PR adds:
Multi-modal token streaming.
Support for parallel calls (single and multiple models) by default, up to N workers.
Supported model type endpoint.
Delete model endpoint.
Custom CORS.
Todo:
[ ] Refactor stream_generate after mlx-vlm's next release
Closes #2, Closes #5
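For readers unfamiliar with the two main ideas above, here is a minimal standard-library sketch of token streaming combined with a capped worker pool. It is not the code from this PR: `fake_generate`, `stream_sse`, `run_parallel`, and `MAX_WORKERS` are hypothetical names, the model call is a stub that just splits the prompt into words, and the `data: ...` framing with a `[DONE]` sentinel is an assumed server-sent-events style convention rather than something this PR specifies.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List

MAX_WORKERS = 2  # hypothetical stand-in for the "N workers" limit


def fake_generate(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model call: yields one token at a time."""
    for word in prompt.split():
        yield word


def stream_sse(prompt: str) -> Iterator[str]:
    """Emit each token as soon as it is produced, framed as an SSE-style chunk."""
    for token in fake_generate(prompt):
        yield "data: " + json.dumps({"token": token}) + "\n\n"
    # Assumed end-of-stream sentinel so a client knows when to stop reading.
    yield "data: [DONE]\n\n"


def run_parallel(prompts: List[str], max_workers: int = MAX_WORKERS) -> List[List[str]]:
    """Handle several requests concurrently with at most max_workers threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input prompts.
        return list(pool.map(lambda p: list(stream_sse(p)), prompts))


if __name__ == "__main__":
    for chunks in run_parallel(["hello world", "stream these tokens"]):
        print("".join(chunks), end="")
```

In a real server the generator would be handed to a streaming HTTP response instead of being collected into a list; collecting it here only keeps the sketch easy to run and inspect.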