Closed by thmsmlr 4 months ago
Author of ollama-ex here. I looked into creating an Instructor adapter a couple of weeks ago (using the standard Ollama API) and concluded that, ideally, Ollama should provide access to setting grammars (as you do with the llamacpp adapter), but currently Ollama doesn't allow this.
Would you agree grammars are needed for this? Or should it be possible to expect an Ollama adapter to work well just by prompting the LLM to conform to a specific JSON schema?
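(For anyone unfamiliar: "grammar" here means a llama.cpp-style GBNF definition that constrains token sampling. A minimal illustrative sketch, deliberately much simpler than a full JSON grammar, might look like:)

```
# Constrain output to a flat JSON object of string/number fields.
root   ::= "{" ws pair ("," ws pair)* ws "}"
pair   ::= ws string ws ":" ws value
value  ::= string | number
string ::= "\"" [a-zA-Z0-9_ ]* "\""
number ::= [0-9]+
ws     ::= [ \t\n]*
```

With a grammar like this loaded, the model physically cannot emit tokens outside the defined structure, which is why it helps so much on smaller models.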
Yeah, it's my personal belief that all LLM inference servers should support grammars, but in the absence of that, I think JSON mode, which most of them do support, is good enough. I've tested JSON mode and MD_JSON mode in the Python version of Instructor, and they work quite well on Mixtral and some of the other big models. Given that, I would still like to provide an easy option for people to experiment for themselves.
Hi! I've been playing quite a lot recently with the llamacpp adapter (tools mode) versus the OpenAI adapter with Ollama as a backend (JSON mode), and the results are significantly worse with Ollama without the grammar param. It would be great to have such an option to try out.
@lilfaf I think there are three PRs adding a grammar option: ollama/ollama#565, ollama/ollama#830 and ollama/ollama#1606... hoping one of these gets merged 🙏🏻
I think your PR that enables a generic JSON mode feels like a good first step until Ollama properly supports grammars.
@lilfaf Yeah, I've been playing with Ollama too. It doesn't work well without grammars on the smaller models, but it works decently well with larger models. I spent the morning testing with nous-hermes2-mixtral:8x7b-dpo-q6_K and it's working quite well. I'll keep an eye on the linked PRs and hopefully we can integrate them soon.
As of yesterday, Ollama supports the OpenAI API spec.
https://ollama.ai/blog/openai-compatibility
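As a rough sketch, using the OpenAI-compatible endpoint just means POSTing a standard chat-completions body to Ollama's local server. The base URL and model name below are assumptions, and whether Ollama's compatibility layer honors `response_format` is something to verify against their docs:

```python
import json

# Assumed default address of a local Ollama server's OpenAI-compatible API.
BASE_URL = "http://localhost:11434/v1"

def chat_payload(model, messages, json_mode=False):
    """Build an OpenAI-style /chat/completions request body as a JSON string."""
    body = {"model": model, "messages": messages}
    if json_mode:
        # OpenAI-style JSON mode: ask the server to constrain output to valid JSON.
        # Whether Ollama's compat layer supports this field is an assumption here.
        body["response_format"] = {"type": "json_object"}
    return json.dumps(body)

# Model name is illustrative (one mentioned earlier in this thread).
payload = chat_payload(
    "nous-hermes2-mixtral:8x7b-dpo-q6_K",
    [{"role": "user", "content": "Extract the person's name from: Alice is 30."}],
    json_mode=True,
)
```

The same body could then be sent to `BASE_URL + "/chat/completions"` with any HTTP client, which is what makes an OpenAI-mode adapter essentially free once this endpoint exists.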
It would be great for us to support this and create documentation showing users how to use it, because Ollama is much easier to get up and running than llama.cpp.
In order to do so, however, we need to support more modes than just tools; see the Python Instructor. I have the support for modes roughed out in the code; we just need to create the implementations.
JSON mode is required to support Ollama.
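In the absence of grammars, JSON mode plus a schema hint in the prompt boils down to roughly this pattern (a hedged sketch of the general approach, not Instructor's actual implementation; the schema shape and function names are illustrative):

```python
import json

# Illustrative "schema" describing the fields we want back from the model.
SCHEMA = {"name": "string", "age": "integer"}

def system_prompt(schema):
    """Embed the schema in the system message so the model knows the shape."""
    return (
        "You are a JSON extraction engine. Reply with a single JSON object "
        "matching this schema, and nothing else:\n" + json.dumps(schema)
    )

def parse_reply(text, schema):
    """Parse the model reply; raise if it isn't JSON with the expected keys."""
    obj = json.loads(text)
    missing = set(schema) - set(obj)
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return obj

# Simulated model reply so the sketch runs without a server.
reply = '{"name": "Alice", "age": 30}'
record = parse_reply(reply, SCHEMA)
```

The validate-and-raise step is what makes JSON mode workable without grammars: a malformed reply can be caught and retried with the error fed back to the model, which is the retry loop the Python Instructor uses.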