vlbosch opened 2 hours ago
I saw that Outlines supports guided decoding with MLX. I think the next step is to make it available in this server engine first (similar to how vLLM does it); after that, function calling should not be too hard to implement. In my opinion, doing it with guided decoding is safer than letting the model generate freely. I would be happy to collaborate on that.
MLX recently gained support for function calling, but the output must still be parsed manually, so it is not OpenAI-compliant. Supporting the OpenAI HTTP server specification could be a feature that sets this project apart from the rest: a drop-in replacement backed by local models. I'd be happy to help you implement it.
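The manual-parsing step could be hidden behind the server. As a hedged sketch (the raw tool-call format varies per model; the `<tool_call>…</tool_call>` wrapper below is one common convention, e.g. Hermes/Qwen-style chat templates), the server would parse the model's raw output and emit the OpenAI `tool_calls` message shape:

```python
# Sketch: convert a raw tool-call block from a local model into an
# OpenAI-style assistant message with `tool_calls`. The wrapper tag
# and call ID scheme here are illustrative assumptions.
import json
import re
import uuid

def to_openai_message(raw: str) -> dict:
    """Return an OpenAI chat-completion assistant message dict."""
    m = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", raw, re.S)
    if not m:
        # No tool call detected: plain assistant text.
        return {"role": "assistant", "content": raw}
    call = json.loads(m.group(1))
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_" + uuid.uuid4().hex[:8],  # illustrative ID scheme
            "type": "function",
            "function": {
                "name": call["name"],
                # OpenAI sends arguments as a JSON *string*.
                "arguments": json.dumps(call.get("arguments", {})),
            },
        }],
    }
```

Combined with guided decoding to guarantee the inner JSON is well-formed, this would let existing OpenAI clients work unchanged against local models.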