InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Downsize the model-loader image #93

Open kerthcet opened 3 weeks ago

kerthcet commented 3 weeks ago

What would you like to be added:

Currently, the model-runner is about 56MB, while the model-loader is about 466MB. We should try to reduce the size of the model-loader.

Why is this needed:

Faster startup, since a smaller image is quicker to pull.

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

kerthcet commented 3 weeks ago

/kind feature
/help