qdrant / fastembed

Fast, Accurate, Lightweight Python library to make State of the Art Embedding
https://qdrant.github.io/fastembed/
Apache License 2.0
1.53k stars, 110 forks

Using a manually downloaded Hugging Face model from a local path instead of downloading it from HF #229

Closed visheshgitrepo closed 6 months ago

visheshgitrepo commented 6 months ago

I downloaded the model files (model.onnx, tokenizer.json, and vocab.txt) from Hugging Face manually.

Now I want to pass this local path and don't want fastembed to download from HF. I tried the code below, but it does not work: it still tries to connect to the internet.

cache_path = "./embedding_model_s3"  # this directory contains the 3 files mentioned above
TextEmbedding(local_files_only=True, model_name="sentence-transformers/all-MiniLM-L6-v2", cache_dir=cache_path)

Is this the right way to do it?

joein commented 6 months ago

Hi @visheshgitrepo, what exactly is the issue?

It checks the model's revision over the internet; if that revision exists in the local directory, it reads the model from there, otherwise it downloads it from the HF hub.

By the way, you're passing local_files_only=True, but it is not supported yet.

visheshgitrepo commented 6 months ago

I'm using fastembed in my application with Guardrails. I pass the engine and model as "fastembed" and "all-MiniLM-L6-v2", which calls fastembed with that model. By default it downloads the model from Hugging Face, and I don't want that. I just want to pass a path to a directory that contains the model files and use it for embeddings, something like this:

cache_path = "./sentence-transformers/all-MiniLM-L6-v2"
embedding_model = TextEmbedding(model_name=cache_path)


joein commented 6 months ago

How did you download your model?

When fastembed downloads a model, it saves it under cache_dir with the following directory structure:

<cache_dir>/models--<repo>--<model-name>/

e.g. if you're using

TextEmbedding(model_name='sentence-transformers/all-MiniLM-L6-v2', cache_dir='./model_cache')

the full path to the model will be

./model_cache/models--qdrant--all-MiniLM-L6-v2-onnx/
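
For illustration, the folder name above follows the Hugging Face hub cache convention of prefixing `models--` and replacing `/` in the repo id with `--`. The helper below is a hypothetical sketch of that naming rule only (`hub_cache_folder` is not a fastembed function). It takes the hub repo id, which in the example above is `qdrant/all-MiniLM-L6-v2-onnx`, not the `model_name` passed to TextEmbedding.

```python
def hub_cache_folder(repo_id: str) -> str:
    """Return the folder name the Hugging Face hub cache uses for a model repo.

    Hypothetical helper for illustration; it only mirrors the naming rule
    "models--<org>--<name>" and does not touch the file system.
    """
    return "models--" + repo_id.replace("/", "--")


# The repo id below is the one visible in the path from the comment above.
print(hub_cache_folder("qdrant/all-MiniLM-L6-v2-onnx"))
# prints: models--qdrant--all-MiniLM-L6-v2-onnx
```

Joining this name onto cache_dir gives the directory to look for when checking whether a model is already cached.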

jai8004 commented 6 months ago

Hi @joein,

I am facing the same issue.

My system setup: a Windows Server on the client side, with access to Hugging Face restricted.

I am trying to use hybrid search from fastembed, but because the client side is restricted from downloading models directly, I need to download the models (SPLADE and all-MiniLM) locally and ship them myself. Even when I provide a custom path as cache_dir, the model fails to load, apparently because it tries to check the latest version of the model online every time.

To fix this on the client network, I changed the default below from False to True in the installed package (in the lib folder of the Python environment where qdrant client was installed), and it started to work.

local_files_only=kwargs.get("local_files_only", False) (line 120 of https://github.com/qdrant/fastembed/blob/main/fastembed/common/model_management.py)

Please add a feature to load an embedding model directly from a local folder path, as many users are restricted from downloading the model every time.

joein commented 6 months ago

Hi @jai8004

The local_files_only option is available as of fastembed==0.2.7.

It seems that you're already using local_files_only, so you're on fastembed 0.2.7.

However, you don't need to change the default value in the library source; you just need to pass local_files_only as a keyword argument when initializing your embedding, e.g. TextEmbedding(local_files_only=True)
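
A minimal sketch of that initialization, assuming fastembed 0.2.7 or later and a cache_dir that fastembed itself populated earlier (for example on a machine with internet access, then copied over). The wrapper name `load_offline_embedding` and the up-front directory check are illustrative additions, not part of fastembed.

```python
from pathlib import Path


def load_offline_embedding(model_name: str, cache_dir: str):
    """Build a TextEmbedding that is asked to use only locally cached files.

    Hypothetical wrapper: it fails early if cache_dir is missing, then passes
    local_files_only=True as a keyword argument, as suggested above.
    """
    cache = Path(cache_dir)
    if not cache.is_dir():
        raise FileNotFoundError(f"cache_dir does not exist: {cache}")

    # Imported lazily so the directory check runs even if fastembed is absent.
    from fastembed import TextEmbedding

    return TextEmbedding(
        model_name=model_name,
        cache_dir=str(cache),
        local_files_only=True,
    )


# Example usage (requires fastembed and a populated cache directory):
# model = load_offline_embedding("sentence-transformers/all-MiniLM-L6-v2", "./model_cache")
# vectors = list(model.embed(["hello world"]))
```

Passing the flag at construction time avoids patching the installed package, so the setting survives upgrades.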

joein commented 6 months ago

Closing this as it has already been implemented. Feel free to start a discussion or open a new issue if something does not work.

datalee commented 6 months ago

mark

praveeniitm commented 5 months ago

@visheshgitrepo, how did you solve your issue? I am facing the same issue while using Guardrails.

satyaloka93 commented 3 months ago

This is not working for me: it still attempts to download from Hugging Face, even with local_files_only=True and cache_dir set to where the model files are. We have to approve models first, so direct HF downloads are not allowed. Why does it still reach out?

satyaloka93 commented 3 months ago

I think it's because the directory is not in the HF cache format but contains the model files themselves. Is there no way to load them from a path?
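
One way to check that suspicion is to look at the directory layout. The sketch below is a hypothetical diagnostic (`describe_model_dir` is not a fastembed function). It assumes the standard Hugging Face hub cache layout, where files live under `models--<org>--<name>/snapshots/<revision>/`, and it treats a directory with `.onnx` files directly inside it as a flat copy of the model files.

```python
from pathlib import Path


def describe_model_dir(cache_dir: str, repo_id: str) -> str:
    """Classify a directory as 'hub-cache', 'flat-files', or 'empty-or-unknown'.

    Hypothetical diagnostic: it only inspects folder names and does not
    validate the model files themselves.
    """
    root = Path(cache_dir)
    repo_folder = root / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_folder / "snapshots"

    # Hub cache layout: at least one revision folder under snapshots/.
    if snapshots.is_dir() and any(p.is_dir() for p in snapshots.iterdir()):
        return "hub-cache"

    # Flat layout: model.onnx (or another .onnx file) sits directly in cache_dir.
    if root.is_dir() and any(root.glob("*.onnx")):
        return "flat-files"

    return "empty-or-unknown"


# Example usage with the repo id seen earlier in this thread:
# print(describe_model_dir("./model_cache", "qdrant/all-MiniLM-L6-v2-onnx"))
```

If this reports "flat-files", the directory holds the raw model files rather than a hub cache, which would match the behavior described above.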