nath1295 / MLX-Textgen

A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX.
MIT License

Feature request: tool/function calling #1

Open vlbosch opened 2 hours ago

vlbosch commented 2 hours ago

MLX recently gained support for function calling, but the output must still be parsed manually, so it is not OpenAI-compliant. Supporting the OpenAI HTTP server specification for tool calls could be a feature that sets this project apart from the rest: a drop-in replacement running local models. I would be happy to help you implement it.
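To illustrate the gap being described: a server would need to take the model's raw function-call text and wrap it into the `tool_calls` structure the OpenAI Chat Completions API defines. A minimal sketch, assuming the model emits a JSON object with `name` and `arguments` keys (the `parse_tool_call` helper and the `call_0` id are hypothetical, not part of this project):

```python
import json

def parse_tool_call(raw: str) -> dict:
    """Wrap a model-emitted JSON tool call into an OpenAI-style
    assistant message carrying a `tool_calls` list."""
    call = json.loads(raw)
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_0",  # placeholder; a real server would generate a unique id
                "type": "function",
                "function": {
                    "name": call["name"],
                    # OpenAI returns arguments as a JSON-encoded string
                    "arguments": json.dumps(call.get("arguments", {})),
                },
            }
        ],
    }

raw = '{"name": "get_weather", "arguments": {"city": "Amsterdam"}}'
msg = parse_tool_call(raw)
print(msg["tool_calls"][0]["function"]["name"])  # get_weather
```

With a wrapper like this on the server side, an unmodified `openai` client could consume the result as if it came from the hosted API.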

nath1295 commented 2 hours ago

I saw that Outlines supports guided decoding with MLX. I think the next step is to make guided decoding available in this server engine first (similar to how vLLM works); function calling should then not be too hard to implement. In my opinion it is safer to do it with guided decoding than to let the model generate freely. I would be happy to collaborate on that.