InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Apache License 2.0

Support ollama #193

Closed qinguoyi closed 1 week ago

qinguoyi commented 2 weeks ago

What this PR does / why we need it

Support ollama

Which issue(s) this PR fixes

https://github.com/InftyAI/llmaz/issues/91

Special notes for your reviewer

  1. Add an OLLAMA protocol: when the model name is accessed, return model.Address and do not inject the init container.
  2. Execute multiple shell commands instead of just `ollama run`, because the ollama service needs to be started first, and we must make sure it is running before the model is run.
  3. Upgrade the flag-parsing implementation so that flags are replaced rather than completely overridden.
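Point 2 above (start the ollama service before running the model) could look roughly like the following container startup script. This is a hedged sketch, not the PR's actual entrypoint: the model name `llama2` is a placeholder, and the readiness check simply polls the standard ollama CLI.

```shell
#!/bin/sh
# Sketch only, assuming the stock ollama CLI is available in the image.

# Start the ollama server in the background.
ollama serve &

# Wait until the server answers; `ollama list` fails
# while the server is still starting up.
until ollama list >/dev/null 2>&1; do
  sleep 1
done

# Only now is it safe to run the model (placeholder name).
ollama run llama2
```

The same start-then-wait-then-run pattern is why a single `ollama run` command is not enough as the container command.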

Does this PR introduce a user-facing change?

Support ollama
qinguoyi commented 2 weeks ago

> Generally LGTM, have you tested locally? Maybe we can add an e2e test, because ollama can still run on CPUs.

Thanks for your review, I ran multiple tests locally before committing. Also, a complete e2e test is definitely necessary, and I will complete it as soon as possible.

kerthcet commented 2 weeks ago

If you have finished the work, feel free to ping me. Not to rush you, just a friendly reminder. 😄

qinguoyi commented 2 weeks ago

> If you have finished the work, feel free to ping me. Not to rush you, just a friendly reminder. 😄

Thanks for your kind reply, and sorry for my late response. I haven't finished yet; I'm still working on the e2e tests. I will finish this work within the week.

qinguoyi commented 1 week ago

> If you have finished the work, feel free to ping me. Not to rush you, just a friendly reminder. 😄

I have completed the ollama e2e test. Please review the code again, @kerthcet.

qinguoyi commented 1 week ago

> /approve
>
> Please squash the commits.

Kind ping @kerthcet, I've squashed the commits.

kerthcet commented 1 week ago

/lgtm Thanks!

kerthcet commented 1 week ago

/kind feature

kerthcet commented 1 week ago

/approve