lyogavin / airllm

AirLLM 70B inference with single 4GB GPU

Integration with ollama server #199

Open drdozer opened 2 weeks ago

drdozer commented 2 weeks ago

Is it possible to integrate this with the ollama model server? I tend to expose LLMs through ollama to the various applications that can talk to it, but I couldn't easily see how to get ollama to use the airllm version of a model so that I can run the larger models locally and access them through the ollama server. Cheers!
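[Editor's note: the sketch below is not from this issue. It is a minimal illustration of one workaround, assuming the `AutoModel` API shown in the AirLLM README; the model ID, the `/api/generate` route, and the `MAX_LENGTH` value are example choices, and this is not a drop-in Ollama integration, since Ollama runs GGUF models through its own backend rather than loading Python code. It only shows how AirLLM could sit behind a small local HTTP server that applications call directly.]

```python
# Minimal sketch: expose an AirLLM model over a local HTTP endpoint.
# Assumes the AutoModel usage from the AirLLM README; route name and
# model ID are illustrative, not an official Ollama-compatible API.
from fastapi import FastAPI
from pydantic import BaseModel
from airllm import AutoModel

MAX_LENGTH = 128  # prompt truncation length (example value)

app = FastAPI()
# Any Hugging Face model repo supported by AirLLM could go here.
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")


class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64


@app.post("/api/generate")
def generate(req: GenerateRequest):
    # Tokenize the prompt with the model's own tokenizer.
    input_tokens = model.tokenizer(
        [req.prompt],
        return_tensors="pt",
        return_attention_mask=False,
        truncation=True,
        max_length=MAX_LENGTH,
    )
    # Layer-by-layer generation; slow, but fits in a few GB of VRAM.
    output = model.generate(
        input_tokens["input_ids"].cuda(),
        max_new_tokens=req.max_new_tokens,
        use_cache=True,
        return_dict_in_generate=True,
    )
    return {"response": model.tokenizer.decode(output.sequences[0])}
```

Run with `uvicorn server:app` and point any HTTP-capable client at `POST /api/generate`; a true Ollama integration would additionally need to match Ollama's API schema and streaming behaviour.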