InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Apache License 2.0

Report filename and file size in modelLoader #122

Closed kerthcet closed 2 months ago

kerthcet commented 2 months ago

What would you like to be cleaned:

Here's the current output of model loader:

Skipping virtualenv creation, as specified in config file.
2024-09-03 14:49:03,464 - __main__ - INFO - loading models from modelhub takes 0:03:15.848986s
Start to download model TheBloke/Llama-2-7B-GGUF
Skipping virtualenv creation, as specified in config file.
2024-09-03 14:45:45,397 - __main__ - INFO - loading models from modelhub takes 0:11:10.482161s
Start to download model TheBloke/Llama-2-7B-GGUF
Error from server (BadRequest): container "model-runner" in pod "llamacpp-speculator-0" is waiting to start: PodInitializing

When downloading GGUF files, the output shows neither the file name nor the file size. We should report both as well.

Why is this needed:

Better observability.

kerthcet commented 2 months ago

/milestone v0.1.0

kerthcet commented 2 months ago

/assign

kerthcet commented 2 months ago

I think the file size is not necessary because we usually already know it. Calculating the size also takes time; if one day it costs nothing, for example because the hub library supports it, we can add it then.
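
To illustrate the outcome discussed above, here is a minimal sketch of a log line that always reports the file name and adds the size only when it is cheap to obtain (here, from a file already on local disk). The helper names `human_size` and `download_message`, the message format, and the example file name are assumptions for illustration, not llmaz's actual model loader code.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger(__name__)


def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units, e.g. 1536 -> '1.5 KiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


def download_message(
    model_id: str,
    filename: Optional[str] = None,
    local_path: Optional[str] = None,
) -> str:
    """Build the log line (hypothetical format).

    The file name is appended when given. The size is appended only if
    local_path points to an existing file, so no extra remote call is made.
    """
    msg = f"Start to download model {model_id}"
    if filename:
        msg += f", file {filename}"
    if local_path and os.path.isfile(local_path):
        msg += f", size {human_size(os.path.getsize(local_path))}"
    return msg


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # The file name below is only an example value.
    logger.info(download_message("TheBloke/Llama-2-7B-GGUF", "llama-2-7b.Q4_K_M.gguf"))
```

Keeping the size optional matches the conclusion of the thread: the file name is reported unconditionally, and the size can be enabled later if it becomes free to compute.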