PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
Apache License 2.0

is it possible to use a previously downloaded HF .gguf file #730

Closed cleesmith closed 6 months ago

cleesmith commented 8 months ago

First, this app works great on a MacBook Pro M3 Max 128GB with lots of transformer and LLM models. It is one of the few RAG apps I have been able to run without the internet (well, once all of the models are downloaded), and using the terminal command "sudo pumas run" I can see it using 100% GPU (mps) during queries. So thank you so much, and for your videos on YouTube.

Since I try new RAG and fine-tuning apps so often, I have a lot of existing GGUF files previously downloaded from Hugging Face. Is there a way to change this app to use any of those previous ".gguf" downloads? It is time-consuming to download the same files over and over again. I did notice that the "models" folder has file types other than just a ".gguf" file ... is there a way to convert a previously downloaded GGUF into the layout used in your "models" folder?

Please let me know and thanks again for this repo.

PromtEngineer commented 8 months ago

Thank you, and glad you are finding this useful. I am not sure; in the snapshots folder under every downloaded model there is the main gguf file. The code uses llama-cpp-python (the Python binding) to download the file, which might be doing the conversion under the hood. Will need to look into that.

VerdonTrigance commented 7 months ago

@cleesmith you may look at https://huggingface.co/docs/huggingface_hub/guides/manage-cache and set the HF_HOME environment variable. I did this personally and all my HF models download there now. On Windows, though, it will keep warning you about symlinks and some other things. Anyway, you may try it. You can also install huggingface-cli and manage downloads and the cache from it.
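For anyone reading later, a minimal sketch of the HF_HOME approach described above; the cache path is illustrative, not a required location:

```shell
# Point the Hugging Face cache at a directory of your choosing (path is illustrative).
export HF_HOME="$HOME/hf_cache"
mkdir -p "$HF_HOME"
# Subsequent huggingface_hub downloads will land under $HF_HOME/hub.
# You can inspect what is already cached with:
#   huggingface-cli scan-cache
echo "HF cache at: $HF_HOME"
```

Set the variable in your shell profile if you want every tool that uses huggingface_hub to share the same cache.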

NitkarshChourasia commented 6 months ago

@VerdonTrigance There is a PR for using symlinks without any errors or bugs. You can look into it; the title of the PR has "symlink" in it. Thank you.

randoentity commented 5 months ago

I'm not sure if there is a better way, but the only PR with "symlink" in the name I found was about ingesting documents, not reusing previously downloaded models. Here's how to do it for anyone still looking:

Example with `TheBloke/Phind-CodeLlama-34B-v2-GGUF` (`phind-codellama-34b-v2.Q6_K.gguf`):
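A rough sketch of the idea: symlink the file you already have into the cache-style layout under the models folder. The snapshot directory name below is a placeholder (the real one is a commit hash) and all paths are illustrative:

```shell
# Reuse an already-downloaded GGUF by symlinking it into the
# models--<org>--<repo>/snapshots/<hash> layout under the models folder.
# "snapshot-placeholder" stands in for the real snapshot directory name.
EXISTING="$HOME/Downloads/phind-codellama-34b-v2.Q6_K.gguf"
REPO_DIR="models/models--TheBloke--Phind-CodeLlama-34B-v2-GGUF/snapshots/snapshot-placeholder"
mkdir -p "$REPO_DIR"
# -sfn: create/replace the symlink even if one already exists
ln -sfn "$EXISTING" "$REPO_DIR/phind-codellama-34b-v2.Q6_K.gguf"
```

Check the directory names your install actually creates after one normal download, and mirror those exactly.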

MoSedky commented 2 months ago

@randoentity I have tried the mentioned steps, but it seems it keeps looking for the file in the repo. Is there any alternative way?

randoentity commented 2 months ago

@MoSedky I haven't used this in a while, but it looks like I wrote it down incorrectly. It should be MODELS_PATH: https://github.com/PromtEngineer/localGPT/blob/a1dea3becb8b1ae28a87369b1636c4c4a4501c27/constants.py#L20
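One way to put that to use, sketched under the assumption that MODELS_PATH defaults to `./models` (the GGUF_DIR path below is hypothetical):

```shell
# Replace the default models directory with a symlink to a folder that
# already holds your GGUF files, so MODELS_PATH resolves to it.
# GGUF_DIR is a hypothetical path; adjust to wherever your files live.
GGUF_DIR="$HOME/gguf-files"
mkdir -p "$GGUF_DIR"
# -sfn: replace an existing "models" symlink in place if present
ln -sfn "$GGUF_DIR" models
```

Alternatively, edit the MODELS_PATH value in constants.py directly to point at your existing folder.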