arcee-ai / fastmlx

FastMLX is a high-performance, production-ready API to host MLX models.

Add support for token streaming, parallel jobs and custom CORS #4

Closed Blaizzy closed 4 months ago

Blaizzy commented 4 months ago

This PR adds:

  1. Multi-modal token streaming.
  2. Support for parallel calls (single and multiple models), by default up to N workers.
  3. An endpoint for supported model types.
  4. A delete-model endpoint.
  5. Custom CORS.
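
As a rough illustration of items 1 and 2, the sketch below combines a token stream with a bounded worker pool using only the Python standard library. It is not FastMLX's implementation: `stream_tokens`, `run_job`, `run_parallel`, and `MAX_WORKERS` are hypothetical names, the "model" simply splits the prompt on whitespace, and the server-sent-event style framing (`data: ...`) is an assumption about how streamed chunks might be delivered.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List

# Hypothetical stand-in for the PR's "N workers" default.
MAX_WORKERS = 2


def stream_tokens(prompt: str) -> Iterator[str]:
    """Yield one SSE-style chunk per token, then a terminator chunk.

    Splitting on whitespace is a placeholder for real model generation.
    """
    for token in prompt.split():
        yield f"data: {json.dumps({'token': token})}\n\n"
    yield "data: [DONE]\n\n"


def run_job(prompt: str) -> List[str]:
    """Drain one stream to completion.

    A real server would forward each chunk to the client as it arrives
    instead of collecting them in a list.
    """
    return list(stream_tokens(prompt))


def run_parallel(prompts: List[str], max_workers: int = MAX_WORKERS) -> List[List[str]]:
    """Run jobs concurrently with at most max_workers active at once.

    pool.map returns results in the same order as the input prompts.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, prompts))
```

Calling `run_parallel(["hello world", "a b c"])` returns one list of chunks per prompt, each ending with the `[DONE]` terminator. Capping the pool size keeps concurrent requests from oversubscribing the machine, which is presumably the motivation for a worker limit here.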

Todo:

Closes #2, Closes #5