InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

ollama support #91

Open kerthcet opened 3 weeks ago

kerthcet commented 3 weeks ago

What would you like to be added:

ollama provides an SDK for integrations, so we can integrate with it easily. One benefit I can think of: ollama maintains a large catalog of quantized models that we can leverage (see the sketch at the end of this comment).

Why is this needed:

Ecosystem integration.

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.
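Roughly what I have in mind on the SDK side, as a minimal sketch using ollama's Go client (`github.com/ollama/ollama/api`). The model tag here is just an example of one of the quantized variants ollama publishes, and the server address comes from the standard `OLLAMA_HOST` environment variable, so nothing here is specific to llmaz yet:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/ollama/ollama/api"
)

func main() {
	// ClientFromEnvironment reads OLLAMA_HOST (default http://127.0.0.1:11434).
	client, err := api.ClientFromEnvironment()
	if err != nil {
		log.Fatal(err)
	}

	// Pull a quantized tag from the ollama model library; the callback
	// receives streaming progress updates while the layers download.
	req := &api.PullRequest{Model: "llama3:8b-instruct-q4_0"} // example tag
	err = client.Pull(context.Background(), req, func(p api.ProgressResponse) error {
		fmt.Println(p.Status)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```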

kerthcet commented 3 weeks ago

/kind feature

kerthcet commented 3 weeks ago

~Because ollama doesn't provide an HTTP server, one way to integrate with it would be to support URIs with an ollama protocol and run inference with llama.cpp~

RE: it does support a REST server, see https://github.com/ollama/ollama/blob/main/docs/api.md
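For reference, a minimal sketch of talking to that REST server from Go, following the `POST /api/generate` example in the linked docs. The address is ollama's default port and the model tag is just an example; `"stream": false` makes the server return a single JSON object instead of a JSON-lines stream:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"model":  "llama3", // example tag, not prescriptive
		"prompt": "Why is the sky blue?",
		"stream": false, // one JSON object instead of a JSON-lines stream
	})

	resp, err := http.Post("http://127.0.0.1:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// The non-streaming response carries the full completion in "response".
	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Println(out.Response)
}
```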