milvus-io / milvus-model

An embedding/reranking model zoo that helps users convert their unstructured data into embeddings
Apache License 2.0

TypeError in BGEM3FlagModel encode() due to mismatched argument name (queries expected instead of sentences) #43

Open mhamdik opened 1 week ago

mhamdik commented 1 week ago

When using the BGEM3EmbeddingFunction from the milvus_model package, calling the encode_queries or encode_documents method leads to a TypeError. The error occurs because the _encode method in BGEM3EmbeddingFunction uses sentences as the argument name when calling self.model.encode. However, the underlying model, BGEM3FlagModel, expects an argument named queries.

Traceback (most recent call last):
  File "/path/to/my/code.py", line 51, in <module>
    query_embeddings = bge_m3.encode_queries(queries)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/api-venv/lib/python3.12/site-packages/milvus_model/hybrid/bge_m3.py", line 102, in encode_queries
    return self._encode(queries)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/api-venv/lib/python3.12/site-packages/milvus_model/hybrid/bge_m3.py", line 83, in _encode
    output = self.model.encode(sentences=texts, **self._encode_config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: M3Embedder.encode() missing 1 required positional argument: 'queries'

Environment: macOS, Python 3.12, and the latest version of the milvus_model package from PyPI. This issue may affect other methods of the BGEM3EmbeddingFunction class that depend on self.model.encode.

from pymilvus.model.hybrid import BGEM3EmbeddingFunction

bge_m3 = BGEM3EmbeddingFunction(
    model_name='BAAI/bge-m3',
    devices="cpu",
    use_fp16=False
)

queries = [
    "What are the effects of climate change?",
    "¿Cuáles son los efectos del cambio climático?",
    "气候变化有什么影响?"
]

# Encoding the queries
query_embeddings = bge_m3.encode_queries(queries=queries)
print("Query Embeddings:", query_embeddings)
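The same kind of failure can be reproduced without downloading the model. The sketch below uses two stand-in classes, OldEmbedder and NewEmbedder. These are hypothetical names for illustration only, not FlagEmbedding classes. The only assumption taken from the traceback is that the wrapper passes sentences= as a keyword while the newer encoder requires a parameter called queries.

```python
class OldEmbedder:
    """Hypothetical stand-in: an encoder whose first parameter is `sentences`."""

    def encode(self, sentences, **kwargs):
        return {"dense": [len(s) for s in sentences]}


class NewEmbedder:
    """Hypothetical stand-in: an encoder whose first parameter is `queries`."""

    def encode(self, queries, **kwargs):
        return {"dense": [len(q) for q in queries]}


def encode_by_keyword(model, texts):
    # Mirrors the call in the traceback: the argument name is hard-coded.
    return model.encode(sentences=texts)


def encode_positionally(model, texts):
    # Passing the texts positionally does not depend on the parameter name.
    return model.encode(texts)


texts = ["hello", "hi"]

# Works: the keyword matches the parameter name.
print(encode_by_keyword(OldEmbedder(), texts))

# Fails: `sentences` is swallowed by **kwargs and `queries` is never supplied.
try:
    encode_by_keyword(NewEmbedder(), texts)
except TypeError as exc:
    print("TypeError:", exc)

# Works with either stand-in.
print(encode_positionally(NewEmbedder(), texts))
```

Whether a positional call is the right fix for milvus_model itself is a separate question; this only shows why a renamed parameter turns a keyword call into a TypeError.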

Thanks

rguilmont commented 1 week ago

Looks like it's due to dependency installation in Milvus.

A new version of FlagEmbedding ( https://pypi.org/project/FlagEmbedding/1.3.2/ ) was released a few days ago and basically broke everything. Maybe we should pin to the working 1.2.x version (I personally prefer installing it via Poetry to avoid this live dependency installation).
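Until a pin lands, a quick way to tell which side of the break an environment is on is to check the installed FlagEmbedding version. The sketch below is only a rough check: the helper names are made up for illustration, and the (1, 3) threshold is an assumption based on this thread (1.2.x reported as working, 1.3.2 reported as broken). Nothing here confirms how other 1.3 releases behave.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Return (major, minor) from the start of a version string such as '1.3.2'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def is_known_good(version_string):
    # Assumption from this thread: releases before 1.3 still accept `sentences=`.
    return parse_release(version_string) < (1, 3)


def installed_flagembedding_version():
    """Return the installed FlagEmbedding version string, or None if absent."""
    try:
        return version("FlagEmbedding")
    except PackageNotFoundError:
        return None


found = installed_flagembedding_version()
if found is None:
    print("FlagEmbedding is not installed")
elif is_known_good(found):
    print(f"FlagEmbedding {found}: expected to work with the current wrapper")
else:
    print(f"FlagEmbedding {found}: likely to hit the TypeError from this issue")
```

A version constraint along the lines of FlagEmbedding>=1.2,<1.3 in the project's dependency file would express the same idea declaratively. The exact bounds are an assumption and should be checked against the PR.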

rguilmont commented 1 week ago

Created a PR to fix that. Just tested and works fine.

https://github.com/milvus-io/milvus-model/pull/44

wxywb commented 1 week ago

Thanks to @rguilmont for the quick fix. I will adapt to the newer version of FlagEmbedding soon.

mhamdik commented 1 day ago

Thanks for @rguilmont 's