Closed ManuXD32 closed 1 month ago
yes
And is it possible to split them? I've been trying with Mistral Nemo but I get this error all the time:
RUST_BACKTRACE=1 cake-split-model --model-path model/Mistral-Nemo-Instruct-2407-Q5_K_M.gguf --topology topology.yml --output output/
thread 'main' panicked at cake-split-model/src/main.rs:149:40: can't load index: Not a directory (os error 20)
Stack backtrace:
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Aborted
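For context on the panic itself: "os error 20" is ENOTDIR, which the OS raises when a path component that must be a directory is actually a regular file. Passing a single .gguf file where the tool apparently expects a model directory would trigger exactly this. A minimal Python illustration (the file and index names here are hypothetical, just to reproduce the errno):

```python
import errno
import os
import tempfile

# ENOTDIR (os error 20) fires when a regular file is used as a directory
# component of a path. 'model.gguf' stands in for any plain file.
with tempfile.TemporaryDirectory() as tmp:
    gguf_path = os.path.join(tmp, "model.gguf")
    with open(gguf_path, "wb") as f:
        f.write(b"GGUF")  # placeholder payload

    try:
        # Treating the file as a directory, the way a loader looking for
        # an index file inside --model-path would:
        open(os.path.join(gguf_path, "model.safetensors.index.json"))
    except NotADirectoryError as e:
        assert e.errno == errno.ENOTDIR  # errno 20 on Linux
        print(f"Not a directory (os error {e.errno})")
```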
I have also tried with other models getting the same error.
You are using a GGUF file, which is not supported by Cake. Only safetensors.
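If it helps to double-check which format a file actually is before splitting: GGUF files start with the ASCII magic `GGUF`, while safetensors files start with a little-endian u64 header length followed by a JSON header. A small best-effort Python sketch (the function name is mine, not part of Cake):

```python
def detect_model_format(path: str) -> str:
    """Best-effort guess of a model file's container format from magic bytes."""
    with open(path, "rb") as f:
        head = f.read(9)
    if head[:4] == b"GGUF":
        # GGUF container: 4-byte ASCII magic, then version and metadata.
        return "gguf"
    if len(head) == 9 and head[8:9] == b"{":
        # safetensors: little-endian u64 header size, then a JSON header
        # that opens with '{'.
        return "safetensors"
    return "unknown"
```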
Are there plans to support gguf format?
possibly at some point, i work on this on my free time so i won't commit to a specific timeline
Okay!! Thanks for your effort.
First of all, I want to thank you for your hard work. I love this project and I think it's awesome to be able to handle inference on different devices. As for me, the point of splitting a model among different devices lies in my current RAM limitations, so I guess it would make much more sense to be able to use quantized versions of the big models.