InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Stop sharing model weights across Pods in the same node #102

Closed: kerthcet closed this 2 months ago

kerthcet commented 2 months ago

What this PR does / why we need it

Stop sharing model weights across Pods on the same node. Right now we have no lifecycle management policy for model weights, so the weights stored on a node may keep growing indefinitely.

Fluid might be able to handle this.
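To illustrate the trade-off, here is a minimal sketch in Go. It assumes (the PR text does not say) that the shared weights lived in a node-level directory such as a `hostPath` volume, and that the per-Pod alternative is something like an `emptyDir`. The `Volume` type, the `modelVolume` and `outlivesPod` helpers, and the `/example/models` path are all hypothetical and are not llmaz or Kubernetes API types.

```go
package main

import "fmt"

// Volume is a simplified, hypothetical stand-in for a Pod volume definition.
// It is NOT the Kubernetes API type; it only models the two cases discussed here.
type Volume struct {
	Name     string
	HostPath string // non-empty: a directory on the node, visible to every Pod that mounts it
	EmptyDir bool   // true: scratch space created per Pod and deleted together with the Pod
}

// modelVolume returns the volume that holds the model weights.
// shareOnNode=true models the old behavior (assumed: node-level directory).
// shareOnNode=false models the behavior after this change (assumed: per-Pod storage).
func modelVolume(shareOnNode bool) Volume {
	if shareOnNode {
		// Hypothetical path, for illustration only.
		return Volume{Name: "model-volume", HostPath: "/example/models"}
	}
	return Volume{Name: "model-volume", EmptyDir: true}
}

// outlivesPod reports whether data written to the volume stays on the node
// after the Pod is deleted. Without a cleanup policy, such data only accumulates.
func outlivesPod(v Volume) bool {
	return v.HostPath != ""
}

func main() {
	for _, share := range []bool{true, false} {
		v := modelVolume(share)
		fmt.Printf("shareOnNode=%v outlivesPod=%v\n", share, outlivesPod(v))
	}
}
```

Under these assumptions, per-Pod storage means each Pod fetches its own copy of the weights, but the space is reclaimed when the Pod goes away, so no separate cleanup policy is needed.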

Which issue(s) this PR fixes

Fixes https://github.com/InftyAI/llmaz/issues/88

Special notes for your reviewer

Does this PR introduce a user-facing change?

None

kerthcet commented 2 months ago

/kind bug
/lgtm
/approve