InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Accelerate model loading #103

Open kerthcet opened 3 weeks ago

kerthcet commented 3 weeks ago

What would you like to be added:

Generally,

However, there are 2 gaps here:

Why is this needed:

Minimize the configuration while still benefiting from the acceleration.

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

kerthcet commented 3 weeks ago

/kind feature

kerthcet commented 3 weeks ago

/priority important-soon