ericcurtin opened this issue 1 month ago
From llamafile:
When you download a new model with ollama, all its metadata is stored in a manifest file under ~/.ollama/models/manifests/registry.ollama.ai/library/. The directory and manifest file name match the model name as returned by ollama list. For instance, for llama3:latest the manifest file is ~/.ollama/models/manifests/registry.ollama.ai/library/llama3/latest.
The manifest maps each file related to the model (e.g. GGUF weights, license, prompt template, etc.) to a sha256 digest. The digest of the entry whose mediaType is application/vnd.ollama.image.model is the one referring to the model's GGUF file.
Each sha256 digest is also used as a filename in the ~/.ollama/models/blobs directory (if you look into that directory you'll see only those sha256-* filenames). This means you can run llamafile directly by passing the blob file as the model. So if, for example, the llama3:latest GGUF digest is sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29, you can point llamafile at ~/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29.
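Putting the two mappings together, here is a minimal sketch of resolving a manifest to its GGUF blob path. The manifest layout (a "layers" list of mediaType/digest pairs, with digests in "sha256:..." form that become "sha256-..." blob filenames) is as described above; the helper function itself is illustrative, not part of ollama:

```python
import json
import os


def gguf_blob_path(manifest_path: str, blobs_dir: str) -> str:
    """Resolve an ollama manifest file to the GGUF blob it references.

    Scans the manifest's "layers" list for the entry whose mediaType is
    application/vnd.ollama.image.model, then converts its "sha256:..."
    digest into the "sha256-..." filename used in the blobs directory.
    """
    with open(manifest_path) as f:
        manifest = json.load(f)
    for layer in manifest["layers"]:
        if layer["mediaType"] == "application/vnd.ollama.image.model":
            digest = layer["digest"]  # e.g. "sha256:00e1317c..."
            return os.path.join(blobs_dir, digest.replace(":", "-"))
    raise ValueError(f"no model layer found in {manifest_path}")
```

The returned path can then be handed straight to llamafile (or any other GGUF consumer) as the model file.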
Can you explain the advantages over cloning from HF directly? If you don't want to deal with git lfs, you can also use aria2c or https://github.com/oobabooga/text-generation-webui/blob/main/download-model.py; both support resuming downloads and verification.
I think it's ideal for simple LLM usage. Something like:
ollama pull mistral
is one of the reasons ollama is so popular. It doesn't get easier than typing one command, right? And the popularity of the ollama registry speaks for itself.
I think supporting both Hugging Face and Ollama would be ideal.
I noticed today that local-ai already has this:
Is your feature request related to a problem? Please describe.
Ollama has an easy-to-use registry; you can pull models via short, simple strings.

Describe the solution you'd like
Here is a script that can pull from the Ollama repo:

Integrate something similar and make it usable.
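The referenced script isn't reproduced here, but as a hedged sketch: the Ollama registry follows the standard OCI distribution API, so pulling a model boils down to fetching the manifest and then the blob it names. The endpoint layout below mirrors the manifest paths described earlier (library/<name>/<tag>); treat the exact URL shapes as an assumption to verify against the registry:

```python
import json
import urllib.request

REGISTRY = "https://registry.ollama.ai"


def manifest_url(name: str, tag: str = "latest") -> str:
    # Official-library models live under the "library/" namespace,
    # matching the on-disk manifest path shown above.
    return f"{REGISTRY}/v2/library/{name}/manifests/{tag}"


def blob_url(name: str, digest: str) -> str:
    # digest is the "sha256:..." value taken from the manifest.
    return f"{REGISTRY}/v2/library/{name}/blobs/{digest}"


def pull_gguf(name: str, tag: str = "latest") -> bytes:
    """Download the GGUF layer for name:tag (requires network access)."""
    req = urllib.request.Request(
        manifest_url(name, tag),
        headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
    )
    with urllib.request.urlopen(req) as resp:
        manifest = json.load(resp)
    # Pick out the layer carrying the model weights.
    model = next(
        layer for layer in manifest["layers"]
        if layer["mediaType"] == "application/vnd.ollama.image.model"
    )
    with urllib.request.urlopen(blob_url(name, model["digest"])) as resp:
        return resp.read()
```

In a real tool you'd stream the blob to disk, verify the sha256 against the digest, and support resume, much like the aria2c/download-model.py options mentioned above.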