InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Apache License 2.0

Support ollama #193

Closed qinguoyi closed 1 week ago

qinguoyi commented 2 weeks ago

What this PR does / why we need it

Support ollama

Which issue(s) this PR fixes

https://github.com/InftyAI/llmaz/issues/91

Special notes for your reviewer

  1. Add an OLLAMA protocol: when the model name is accessed, return model.Address and do not inject the init container.
  2. Execute multiple shell commands instead of just `ollama run`, because the ollama service needs to be started first, and we must make sure it is running before the model is run.
  3. Upgrade the flag-parsing implementation so that flags are replaced rather than completely overridden.
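Point 2 above (start the ollama service before running the model) could look roughly like the following container startup script. This is a hedged sketch, not the PR's actual entrypoint: the model name `llama2` is a placeholder, and the readiness check simply polls the standard ollama CLI.

```shell
#!/bin/sh
# Sketch only, assuming the stock ollama CLI is available in the image.

# Start the ollama server in the background.
ollama serve &

# Wait until the server answers; `ollama list` fails
# while the server is still starting up.
until ollama list >/dev/null 2>&1; do
  sleep 1
done

# Only now is it safe to run the model (placeholder name).
ollama run llama2
```

The same start-then-wait-then-run pattern is why a single `ollama run` command is not enough as the container command.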

Does this PR introduce a user-facing change?

Support ollama
qinguoyi commented 2 weeks ago

> Generally LGTM, have you tested locally? Maybe we can add an e2e test, because ollama can still run on CPUs.

Thanks for your review, I ran multiple tests locally before committing. Also, a complete e2e test is definitely necessary, and I will complete it as soon as possible.

kerthcet commented 2 weeks ago

If you have finished the work, feel free to ping me. Not to rush you, just a friendly reminder. 😄

qinguoyi commented 2 weeks ago

> If you have finished the work, feel free to ping me. Not to rush you, just a friendly reminder. 😄

Thanks for your kind reply, and sorry for my late response. I haven't finished yet; I'm still working on the e2e tests. I will finish this work within the week.

qinguoyi commented 1 week ago

> If you have finished the work, feel free to ping me. Not to rush you, just a friendly reminder. 😄

I have completed the ollama e2e test. Please review the code again, @kerthcet.

qinguoyi commented 1 week ago

> /approve
>
> Please squash the commits.

Kind ping @kerthcet, I've squashed the commits.

kerthcet commented 1 week ago

/lgtm Thanks!

kerthcet commented 1 week ago

/kind feature

kerthcet commented 1 week ago

/approve