Open erojaso opened 12 hours ago
Hi @erojaso, why do you explicitly convert the 768-dim vector to 1024 with `embedding = embedding_pre + [0.0] * (1024 - len(embedding_pre))`? Please remove this conversion and the code should work.
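For reference, a minimal sketch of the problematic padding and the fix. The variable names are guesses rather than the reporter's actual code, and the embedder here is a local HuggingFace model instead of a TEI endpoint:

```python
from langchain_huggingface import HuggingFaceEmbeddings  # assumption: local HF embeddings

embedder = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")
embedding_pre = embedder.embed_query("sample text")  # bge-base produces 768 floats

# Problematic: zero-padding to 1024 stores 1024-dim vectors, while the retriever
# later queries with unpadded 768-dim vectors, so pgvector raises
# "different vector dimensions 1024 and 768".
embedding_padded = embedding_pre + [0.0] * (1024 - len(embedding_pre))

# Fix: keep the embedding exactly as produced, so ingestion and retrieval agree.
embedding = embedding_pre
```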
Please make sure your dataprep and retriever both access the same TEI endpoint / local model. If one of them uses bge-base (dim 768) and the other bge-large (dim 1024), you are likely to hit exactly this kind of dimension mismatch.
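A quick way to verify this is to ask each TEI service for an embedding and compare the lengths. This is only a sketch; the URLs are placeholders for your actual dataprep and retriever endpoints:

```python
import requests

# Assumed service URLs; replace with the TEI endpoints configured for each service.
ENDPOINTS = {
    "dataprep": "http://dataprep-tei:80/embed",
    "retriever": "http://retriever-tei:80/embed",
}

for name, url in ENDPOINTS.items():
    resp = requests.post(url, json={"inputs": "dimension check"})
    resp.raise_for_status()
    dim = len(resp.json()[0])  # TEI /embed returns a list of embedding vectors
    print(f"{name}: {dim} dimensions")

# bge-base-en-v1.5 should report 768 on both sides; a 768/1024 split reproduces the error.
```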
Priority
Undecided
OS type
Ubuntu
Hardware type
CPU-other (Please let us know in description)
Installation method
Deploy method
Running nodes
Single Node
What's the version?
docker pull opea/dataprep-pgvector:latest
docker pull opea/retriever-pgvector:latest
Description
After deploying my services as containers in Kubernetes, I performed some data ingestion tests against PostgreSQL with the LangChain framework.
When testing retriever-pgvector I encounter the error: sqlalchemy.exc.DataError: (psycopg2.errors.DataException) different vector dimensions 1024 and 768.
Although I use the default model "BAAI/bge-base-en-v1.5" (768 dimensions), the vectors stored in the DB have a different size.
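To confirm what was actually written during ingestion, something along these lines can be run against the database. The connection string is a placeholder, and the table/column names follow LangChain's default PGVector schema, so treat them as assumptions:

```python
import psycopg2

# Placeholder connection string; adjust to your PGVector deployment.
conn = psycopg2.connect("postgresql://postgres:password@localhost:5432/vectordb")
with conn, conn.cursor() as cur:
    # langchain_pg_embedding.embedding is LangChain's default PGVector table/column.
    cur.execute("SELECT vector_dims(embedding) FROM langchain_pg_embedding LIMIT 1;")
    print("stored vector dimension:", cur.fetchone()[0])

# bge-base-en-v1.5 should give 768; seeing 1024 here points at padding during dataprep.
```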
Reproduce steps
With this code I was able to reproduce the error:
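(The original snippet is not included above; the following is a hypothetical minimal reproduction, assuming the collection was ingested with 1024-dim padded vectors and is then queried with the 768-dim bge-base model. The connection string and collection name are placeholders.)

```python
from langchain_community.vectorstores import PGVector
from langchain_huggingface import HuggingFaceEmbeddings

store = PGVector(
    connection_string="postgresql+psycopg2://postgres:password@localhost:5432/vectordb",
    embedding_function=HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5"),
    collection_name="rag-pgvector",  # assumed collection name
)

# If the stored vectors are 1024-dim, this raises
# sqlalchemy.exc.DataError: different vector dimensions 1024 and 768.
docs = store.similarity_search("What is OPEA?", k=3)
print(docs)
```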
Raw log