Closed: settur1409 closed this issue 8 months ago.
The embedding model produces only 384-dimensional vectors; I'm not sure where 1536 is coming into the picture.
I guess it comes from

ch_engine = index.as_chat_engine(llm=llm, chat_mode='openai')

OpenAI uses 1536-dimensional embeddings.
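A quick way to confirm which embedder LlamaIndex has actually resolved is to inspect the global Settings. A minimal sketch, assuming llama-index-core 0.10.x (note the get_text_embedding call will hit the OpenAI API if the default embedder is active):

from llama_index.core import Settings

# If no embed model was configured, Settings.embed_model resolves to the
# default OpenAIEmbedding (text-embedding-ada-002, 1536 dims), not FastEmbed.
print(type(Settings.embed_model).__name__)
print(len(Settings.embed_model.get_text_embedding("dimension check")))  # 1536 vs. 384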
Hey @agourlay, I didn't use OpenAI embeddings. I tried another alternative:

q_engine = index.as_query_engine()
q_engine.query(query)

This also gives the same error, and I don't have any LLM here. Let me know your inputs.
I believe this is a fastembed-specific issue, so I took the liberty of transferring it directly to this repository.
Dear @NirantK, any idea what is going on there?
@agourlay @settur1409 Will reproduce and get back with findings. In the meantime, @Anush008 maintains the LlamaIndex bindings for all Qdrant packages, including FastEmbed.
@settur1409, could you share your entire code? I'd need that to reproduce. Maybe as a Gist or a file?
qdrant_main_file.txt --> I compiled my code into a single file. Please check. Below is the log I got:
Fetching 7 files: 100%|██████████| 7/7 [00:00<?, ?it/s]
Parsing nodes: 100%|██████████| 131/131 [00:00<00:00, 131.09it/s]
100%|██████████| 177/177 [01:40<00:00, 1.76it/s]
100%|██████████| 177/177 [01:40<00:00, 1.76it/s]
Generating embeddings: 100%|██████████| 177/177 [01:03<00:00, 2.79it/s]
This is the error I got:

qdrant_client.http.exceptions.UnexpectedResponse: Unexpected Response: 400 (Bad Request)
Raw response content: b'{"status":{"error":"Wrong input: Vector inserting error: expected dim: 384, got 1536"},"time":0.001277349}'

Let me know if you need any other information.
From the UI: attached are a few snapshots from the Qdrant GUI.
@settur1409, what version of LlamaIndex are you on? There have been quite a lot of changes in the recent versions.
Below is the list of llama-index related packages I see in my env:

llama-index                                0.10.16
llama-index-agent-openai                   0.1.5
llama-index-cli                            0.1.7
llama-index-core                           0.10.16.post1
llama-index-embeddings-fastembed           0.1.4
llama-index-embeddings-huggingface         0.1.4
llama-index-embeddings-openai              0.1.6
llama-index-indices-managed-llama-cloud    0.1.3
llama-index-legacy                         0.9.48
llama-index-llms-anyscale                  0.1.3
llama-index-llms-langchain                 0.1.3
llama-index-llms-openai                    0.1.7
llama-index-multi-modal-llms-openai        0.1.4
llama-index-program-openai                 0.1.4
llama-index-question-gen-openai            0.1.3
llama-index-readers-file                   0.1.8
llama-index-readers-llama-parse            0.1.3
llama-index-vector-stores-chroma           0.1.5
llama-index-vector-stores-qdrant           0.1.4
llama-parse                                0.3.6
llamaindex-py-client                       0.1.13
qdrant-client version: 1.8.0
@settur1409, you have to move the following lines to the top (below the imports):
from llama_index.core import Settings
embed_model = FastEmbedEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
Settings.embed_model = embed_model
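To spell out why the order matters: the index resolves its embed model when it is created, so if Settings.embed_model is set afterwards, ingestion and/or querying silently fall back to the 1536-dim OpenAI default. A minimal sketch of the corrected ordering, assuming a setup like qdrant_main_file.txt (url, qdrant_api_key, collection_name, documents, and query are placeholders):

from qdrant_client import QdrantClient
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.embeddings.fastembed import FastEmbedEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Register the 384-dim FastEmbed model globally BEFORE any index is built or queried.
Settings.embed_model = FastEmbedEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

client = QdrantClient(url=url, api_key=qdrant_api_key)
vector_store = QdrantVectorStore(client=client, collection_name=collection_name)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

retriever = index.as_retriever()
nodes = retriever.retrieve(query)  # the query is now embedded with the same 384-dim model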
Thank you @Anush008. That fixed the problem.
Hello guys @Anush008, I am getting the same issue, but I am using NVIDIA embeddings. I created the embeddings of a PDF document using the bge-small model, and then I tried to use those embeddings in my RAG, but I am getting the same error:

Unexpected Response: 400 (Bad Request)
Raw response content: b'{"status":{"error":"Wrong input: Vector dimension error: expected dim: 384, got 1024"},"time":0.00041412}'
here is my code:

import os
import time
import logging
from telegram import Update
from telegram.ext import ApplicationBuilder, CommandHandler, MessageHandler, filters, ContextTypes
from qdrant_client import QdrantClient
from langchain_qdrant import Qdrant
from dotenv import load_dotenv
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings, ChatNVIDIA
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain

logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

nvidia_api_key = os.getenv('NVIDIA_API_KEY')
telegram_token = os.getenv('TELEGRAM_BOT_TOKEN')

logger.info(f"NVIDIA_API_KEY: {nvidia_api_key}")
logger.info(f"TELEGRAM_BOT_TOKEN: {telegram_token}")

os.environ['NVIDIA_API_KEY'] = nvidia_api_key

llm = ChatNVIDIA(model="meta/llama3-70b-instruct")  # NVIDIA inference
embeddings = NVIDIAEmbeddings()

url = "http://ec2-13-53-193-62.eu-north-1.compute.amazonaws.com:6333"
client = QdrantClient(url=url, prefer_grpc=False)
vectors = Qdrant(client=client, embeddings=embeddings, collection_name="grade_9")

prompt_template = ChatPromptTemplate.from_template(
    """
    Answer the question based on the provided context only.
    Please provide the most accurate response based on the question.
Try deleting the "grade_9" collection. Langchain will auto-create it for you with the appropriate dimensions when you run methods like from_documents() and from_texts().
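In code, that amounts to something like the following hedged sketch, reusing the url from the snippet above; docs stands for your split PDF chunks and is a placeholder:

from qdrant_client import QdrantClient
from langchain_qdrant import Qdrant
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

client = QdrantClient(url=url, prefer_grpc=False)
client.delete_collection(collection_name="grade_9")  # drop the stale 384-dim collection

# from_documents() re-creates the collection sized to the embedder's
# output (1024 per the error above for NVIDIAEmbeddings).
vectors = Qdrant.from_documents(
    docs,
    NVIDIAEmbeddings(),
    url=url,
    prefer_grpc=False,
    collection_name="grade_9",
)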
Current Behavior
Getting qdrant_client.http.exceptions.UnexpectedResponse: Unexpected Response: 400 (Bad Request) when performing retriever.retrieve(query)
Steps to Reproduce
Using FastEmbedEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2") from llama_index.embeddings.fastembed (https://qdrant.github.io/fastembed/examples/Supported_Models/):

Settings.embed_model = embed_model
self.client = QdrantClient(url=url, api_key=qdrant_api_key)
ch_engine = index.as_chat_engine(llm=llm, chat_mode='openai')
retriever = index.as_retriever()
retriever.retrieve(query)  # sending the query as a str
I am getting the below error:

qdrant_client.http.exceptions.UnexpectedResponse: Unexpected Response: 400 (Bad Request)
Raw response content: b'{"status":{"error":"Wrong input: Vector inserting error: expected dim: 384, got 1536"},"time":0.00039087}'

The embedding model produces only 384-dimensional vectors; I'm not sure where 1536 is coming into the picture.
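One way to pin down the mismatch is to compare what the collection stores against what the query is embedded with. A sketch, assuming the url, qdrant_api_key, collection_name, and query from the steps above, and a single unnamed vector config on the collection:

from qdrant_client import QdrantClient
from llama_index.core import Settings

client = QdrantClient(url=url, api_key=qdrant_api_key)
info = client.get_collection(collection_name)
print(info.config.params.vectors.size)  # dimension the collection expects (384 here)

query_vec = Settings.embed_model.get_query_embedding(query)
print(len(query_vec))  # dimension of the query embedding (1536 if the OpenAI default leaked in)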
Expected Behavior
The embedding model does not seem to be applied to the query, and I don't see a way to pass the embedding model while creating the index.
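For the record, llama-index 0.10.x does accept an embed model per index instead of relying on the global Settings; a hedged sketch, with documents as a placeholder:

from llama_index.core import VectorStoreIndex
from llama_index.embeddings.fastembed import FastEmbedEmbedding

embed_model = FastEmbedEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# The embed_model passed here is used both for ingestion and for embedding
# queries issued through this index's retrievers and query engines.
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)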