Describe the need of your request
I use mlx-lm to serve an HTTP API for generating text with any supported model. The HTTP API is intended to be similar to the OpenAI chat API.
It is very similar (if not the same as Ollama's); however, code completion is not working properly: the server does not seem to handle the stop token correctly, so generation runs past where it should end.
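
A minimal reproduction sketch of the kind of request involved, assuming an `mlx_lm.server` instance running on `localhost:8080` with the OpenAI-compatible `/v1/chat/completions` endpoint described in SERVER.md; the exact stop strings here are hypothetical placeholders, not ones confirmed to trigger the bug:

```python
import json

# Request body with explicit stop sequences; the reported problem is that
# generation does not appear to halt on the model's stop token even when
# stop strings are supplied.
payload = {
    "model": "default_model",  # placeholder model name
    "messages": [{"role": "user", "content": "def fib(n):"}],
    "stop": ["</s>", "\n\n"],  # hypothetical explicit stop strings
    "max_tokens": 128,
}
body = json.dumps(payload)

# Would be sent with, e.g.:
#   curl -X POST http://localhost:8080/v1/chat/completions \
#        -H "Content-Type: application/json" -d "$BODY"
print(body)
```

With a correctly handled stop token, the completion should end at the first matching stop sequence instead of continuing to `max_tokens`.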
Proposed solution
No response
Additional context
More info:
https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md