containers / ramalama

The goal of RamaLama is to make working with AI boring.
MIT License

pulling Granite 3 models #367

Closed: tarilabs closed this issue 3 weeks ago

tarilabs commented 1 month ago

Currently, granite is listed as a shortcode corresponding to a GGUF serialization:

https://github.com/containers/ramalama/blob/cd1e7d53b570beb00cb767b97fe14749b3932ac0/README.md?plain=1#L56-L57

What would be the equivalent for Granite 3 series recently released, please?

https://huggingface.co/collections/ibm-granite/granite-30-models-66fdb59bbb54785c3512114f

Can ramalama also pull the HF ModelCard, so as to make use of it during push?

ericcurtin commented 1 month ago

I can't seem to find GGUFs for those on Hugging Face, but since they are on Ollama, we can already pull them via the shortnames `granite3-dense` and `granite3-moe`.
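For reference, pulling via those shortnames would look something like this (a sketch assuming a local ramalama install; output omitted):

```shell
# Pull the Granite 3 models via their Ollama-backed shortnames
ramalama pull granite3-dense
ramalama pull granite3-moe

# Then chat with one interactively
ramalama run granite3-dense
```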

ericcurtin commented 1 month ago

> Can ramalama pull also the HF ModelCard, so to make use of it during push?

I don't see why not 😄 we would merge functionality like this.

tarilabs commented 1 month ago

Thanks for the feedback in https://github.com/containers/ramalama/issues/367#issuecomment-2435172868 and https://github.com/containers/ramalama/issues/367#issuecomment-2435173992 !

> I can't seem to find gguf's for those on huggingface

So for my understanding: is gguf the supported format, or are there other supported formats? 🤔 (sorry if maybe a banal question 😅 )

ericcurtin commented 1 month ago

> Thanks for the feedback in #367 (comment) and #367 (comment) !
>
> > I can't seem to find gguf's for those on huggingface
>
> So for my understanding: is gguf the supported format, or are there other supported formats? 🤔 (sorry if maybe a banal question 😅 )

Right now only .gguf works well. We are open to supporting other formats and other runtimes (llama.cpp and vllm are two planned ones).

As with most features, it often comes down to whether someone with the time can open a PR!
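Note that ramalama can also pull a GGUF directly from Hugging Face with the `huggingface://` transport, so if GGUF conversions of the Granite 3 models do appear there, something like this should work (the repo and file path below are hypothetical placeholders, not a real Granite 3 GGUF repo):

```shell
# Hypothetical repo/file path; substitute a real GGUF once one is published
ramalama pull huggingface://ibm-granite/granite-3.0-2b-instruct-GGUF/granite-3.0-2b-instruct.Q4_K_M.gguf
```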

ericcurtin commented 3 weeks ago

I'm gonna close this because the Ollama versions already have shortnames, and these aren't available on HF in the correct format, which is kinda out of our control.