I can't seem to find ggufs for those on huggingface, but since they are on ollama, we can already pull them via the shortnames granite3-dense and granite3-moe.
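Pulling them should then be something like this (a sketch, assuming the shortnames are registered in your install's shortnames config):

```
# pull the Granite 3 models via their Ollama shortnames
ramalama pull granite3-dense
ramalama pull granite3-moe
```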
> Can ramalama pull also the HF ModelCard, so to make use of it during push?
I don't see why not 😄 we would merge functionality like this.
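For anyone who wants the card in the meantime: on the Hub it is just the repo's README.md, so it can already be fetched with the stock Hugging Face CLI. A minimal sketch, using one repo id from the Granite 3 collection as an example:

```
# download only the model card (README.md) from a Granite 3 repo;
# huggingface-cli ships with the huggingface_hub Python package
huggingface-cli download ibm-granite/granite-3.0-2b-instruct README.md
```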
Thanks for the feedback in https://github.com/containers/ramalama/issues/367#issuecomment-2435172868 and https://github.com/containers/ramalama/issues/367#issuecomment-2435173992 !
> I can't seem to find gguf's for those on huggingface
So, for my understanding: is gguf the only supported format, or are there other supported formats? 🤔 (sorry if it's maybe a banal question 😅)
Right now only .gguf works well. We are open to supporting other formats and other runtimes (llama.cpp and vllm are the two runtimes planned).
As with most features, it often comes down to whether someone with the time can open a PR!
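(Side note: if you're unsure whether a downloaded model is actually GGUF, the format is easy to recognize because it opens with the 4-byte magic `GGUF`:)

```
# sanity check: a GGUF model file starts with the magic bytes "GGUF"
head -c 4 /path/to/model.gguf   # prints: GGUF
```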
I'm gonna close this because the Ollama versions already have shortnames, and these models aren't available on HF in the correct format, which is kinda out of our control.
Currently, granite is shown as a shortname corresponding to a gguf serialization:
https://github.com/containers/ramalama/blob/cd1e7d53b570beb00cb767b97fe14749b3932ac0/README.md?plain=1#L56-L57
What would be the equivalent for the recently released Granite 3 series, please?
https://huggingface.co/collections/ibm-granite/granite-30-models-66fdb59bbb54785c3512114f
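For illustration, the mapping for these models would live in ramalama's shortnames config (TOML); the entries below are hypothetical, guessed from the format of the existing granite entry:

```
# hypothetical shortname entries -- names/URLs illustrative, not merged
[shortnames]
  "granite3-dense" = "ollama://granite3-dense"
  "granite3-moe" = "ollama://granite3-moe"
```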
Can ramalama pull also the HF ModelCard, so to make use of it during push?