llmware-ai / llmware

Unified framework for building enterprise RAG pipelines with small, specialized models
https://llmware-ai.github.io/llmware/
Apache License 2.0

Setting Model Repo Path #932

Open · JBatUN opened this issue 1 month ago

JBatUN commented 1 month ago

When setting the model repo path as indicated in the documentation:

LLMWareConfig().set_home(".\\llm_models\\")

And confirming that it has taken effect with:

LLMWareConfig().get_model_repo_path()
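
For reference, here are the two steps above as one runnable snippet (the print just shows the resolved path; the Windows-style relative path is from my setup):

```python
from llmware.configs import LLMWareConfig

# set the llmware home / model repo location
LLMWareConfig().set_home(".\\llm_models\\")

# print the resolved model repo path to confirm the setting took effect
print(LLMWareConfig().get_model_repo_path())
```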

When using ModelCatalog().get_llm_toolkit(), the models download to the correct location.

But when using ModelCatalog().load_model("llmware/bling-phi-3-gguf"), the files end up in the cache directory (.cache\huggingface\hub).

Because I want to package a RAG / PySide6 application, I would like to load local GGUF models from within the app directory structure.

How can I force model downloads to the set home path?

Thank you!

doberst commented 1 month ago

@JBatUN- thanks - this is a great question. For GGUF models from HF, llmware pulls a snapshot of the HF model repo using the huggingface_hub api, and then places the gguf files in the llmware model repo path - and deletes the symlinks that HF puts there - and then the only place that .load_model will look is the llmware model repo path - so the .gguf file in the model repo path is what is getting loaded/executed. However, HF manages its own records in the .cache path, and it does create a placeholder entry, even for GGUF models. Try deleting that HF .cache entry - it should not effect the use of the model. (Please note that the situation is different for Pytorch models, where llmware by default will use the HF .from_pretrained() - and the files will be kept in the HF .cache. There are workarounds for this Pytorch model loading, but we have not built into llmware at this point). Please let me know if this answers your question.