Can't change embedding model

mohammad-yousuf commented 9 months ago

Hi,

I am trying to change the embedding model by specifying it in the config.yaml file, but I am getting an HF authentication error. Can you please help?

    embedding_model:
      type: sentence_transformer
      model_name: "infloat/e5-large-v2"

snexus commented 9 months ago

Hi,

You have a typo in the model name (a missing letter "t"), so the model can't be found. It should be "intfloat/e5-large-v2".
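A misspelled model id can show up as an authentication error rather than a "not found" error, most likely because the Hub does not tell an anonymous client whether a repository is missing or private. One way to catch this kind of typo before loading anything is to compare the configured name against the ids you intend to use. The following is a minimal sketch only: `KNOWN_MODEL_IDS` and `suggest_model_id` are hypothetical names made up for illustration and are not part of llm-search or any Hugging Face library.

```python
import difflib

# Assumption: a small, hand-maintained list of model ids you plan to use.
# This list is illustrative and is not shipped by llm-search.
KNOWN_MODEL_IDS = [
    "intfloat/e5-large-v2",
    "BAAI/bge-large-en-v1.5",
]


def suggest_model_id(name, known=KNOWN_MODEL_IDS, cutoff=0.8):
    """Return `name` if it is a known id, else the closest known id, else None."""
    if name in known:
        return name
    # get_close_matches ranks candidates by difflib's similarity ratio
    # and drops anything scoring below `cutoff`.
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    configured = "infloat/e5-large-v2"  # the misspelled value from the config
    suggestion = suggest_model_id(configured)
    if suggestion != configured:
        print(f"Unknown model id {configured!r}; did you mean {suggestion!r}?")
```

The check is purely local, so it works offline, but it can only flag names that are close to something already in the list.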

mohammad-yousuf commented 9 months ago

Oh okay, thank you so much @snexus. I have one more question, if you could please help. I have a dataset of regulations: around 60 PDFs, each containing hundreds of pages. The PDFs also contain images (which we don't need), as well as charts and tables alongside a lot of text.

My approach so far:

LLM: mixtral-8x7b-instruct-v0.1
Embedding model: BAAI/bge-large-en-v1.5
LlamaIndex (metadata extractor: tomaarsen/span-marker-mbert-base-multinerd, title extraction, Pinecone hybrid search)

I can't seem to get good accuracy. I am thinking of building a knowledge graph, but the process is too slow given the size of my dataset. Can you give me some pointers or leads on how I should preprocess the data and use your tool, along with any additional techniques? It would be very helpful.

Thanks in advance.

snexus / llm-search

Can't change embedding model #92