langchain-ai / langchain

BUG in langchain_community.vectorstores.qdrant #23626

Open deepslit opened 1 week ago

deepslit commented 1 week ago

Issue Description

I encountered a problem when using the Qdrant.from_existing_collection method in the LangChain Qdrant integration. Here is the code I used:

from langchain_community.vectorstores.qdrant import Qdrant

url = "http://localhost:6333"
collection_name = "unique_case_2020"

qdrant = Qdrant.from_existing_collection(
    embedding=embeddings,  # Please set according to actual situation
    collection_name=collection_name,
    url=url
)

When I run this code, I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[21], line 7
      3 url = "http://localhost:6333"
      4 collection_name = "unique_case_2020"
----> 7 qdrant = Qdrant.from_existing_collection(
      8     embedding=embeddings,  # Please set according to actual situation
      9     collection_name=collection_name,
     10     url=url
     11 )

TypeError: Qdrant.from_existing_collection() missing 1 required positional argument: 'path'
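
For illustration, here is a minimal sketch of why the call fails. It uses a hypothetical stand-in function, not the real langchain_community code. The stand-in copies the parameter order that appears in the library traceback further down in this issue (embedding, path, collection_name, ...) and assumes that path has no default while the remaining parameters default to None.

```python
import inspect
from typing import Any, Dict, Optional


# Hypothetical stand-in mirroring the parameter order seen in the traceback.
# The defaults on collection_name and url are assumptions for this sketch.
def from_existing_collection(
    embedding: Any,
    path: str,  # no default, so Python treats it as required
    collection_name: Optional[str] = None,
    url: Optional[str] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    return {"path": path, "collection_name": collection_name, "url": url}


# Collect the parameters a caller is forced to supply.
required = [
    name
    for name, param in inspect.signature(from_existing_collection).parameters.items()
    if param.default is inspect.Parameter.empty
    and param.kind is not inspect.Parameter.VAR_KEYWORD
]
print(required)

# Calling without path reproduces the same kind of TypeError.
try:
    from_existing_collection(
        embedding=object(),
        collection_name="unique_case_2020",
        url="http://localhost:6333",
    )
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Under these assumptions, path is required regardless of whether url is given, which matches the error above.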

To resolve this, I added the path argument, but encountered another error:

qdrant = Qdrant.from_existing_collection(
    embedding=embeddings,  # Please set according to actual situation
    collection_name=collection_name,
    url=url,
    path=""
)

This raised the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[23], line 7
      3 url = "http://localhost:6333"
      4 collection_name = "unique_case_2020"
----> 7 qdrant = Qdrant.from_existing_collection(
      8     embedding=embeddings,  # Please set according to actual situation
      9     collection_name=collection_name,
     10     url=url,
     11     path=""
     12 )

File ~/.local/lib/python3.10/site-packages/langchain_community/vectorstores/qdrant.py:1397, in Qdrant.from_existing_collection(cls, embedding, path, collection_name, location, url, port, grpc_port, prefer_grpc, https, api_key, prefix, timeout, host, **kwargs)
   1374 @classmethod
   1375 def from_existing_collection(
   1376     cls: Type[Qdrant],
   (...)
   1390     **kwargs: Any,
   1391 ) -> Qdrant:
   1392     """
   1393     Get instance of an existing Qdrant collection.
   1394     This method will return the instance of the store without inserting any new
   1395     embeddings
   1396     """
-> 1397     client, async_client = cls._generate_clients(
   1398         location=location,
   1399         url=url,
   1400         port=port,
   1401         grpc_port=grpc_port,
   1402         prefer_grpc=prefer_grpc,
   1403         https=https,
   1404         api_key=api_key,
   1405         prefix=prefix,
   1406         timeout=timeout,
   1407         host=host,
   1408         path=path,
   1409         **kwargs,
   1410     )
   1411     return cls(
   1412         client=client,
   1413         async_client=async_client,
   (...)
   1416         **kwargs,
   1417     )

File ~/.local/lib/python3.10/site-packages/langchain_community/vectorstores/qdrant.py:2250, in Qdrant._generate_clients(location, url, port, grpc_port, prefer_grpc, https, api_key, prefix, timeout, host, path, **kwargs)
   2233 @staticmethod
   2234 def _generate_clients(
   2235     location: Optional[str] = None,
   (...)
   2246     **kwargs: Any,
   2247 ) -> Tuple[Any, Any]:
   2248     from qdrant_client import AsyncQdrantClient, QdrantClient
-> 2250     sync_client = QdrantClient(
   2251         location=location,
   2252         url=url,
   2253         port=port,
   2254         grpc_port=grpc_port,
   2255         prefer_grpc=prefer_grpc,
   2256         https=https,
   2257         api_key=api_key,
   2258         prefix=prefix,
   2259         timeout=timeout,
   2260         host=host,
   2261         path=path,
   2262         **kwargs,
   2263     )
   2265     if location == ":memory:" or path is not None:
   2266         # Local Qdrant cannot co-exist with Sync and Async clients
   2267         # We fallback to sync operations in this case
   2268         async_client = None

File ~/.local/lib/python3.10/site-packages/qdrant_client/qdrant_client.py:107, in QdrantClient.__init__(self, location, url, port, grpc_port, prefer_grpc, https, api_key, prefix, timeout, host, path, force_disable_check_same_thread, grpc_options, auth_token_provider, **kwargs)
    104 self._client: QdrantBase
    106 if sum([param is not None for param in (location, url, host, path)]) > 1:
--> 107     raise ValueError(
    108         "Only one of <location>, <url>, <host> or <path> should be specified."
    109     )
    111 if location == ":memory:":
    112     self._client = QdrantLocal(
    113         location=location,
    114         force_disable_check_same_thread=force_disable_check_same_thread,
    115     )

ValueError: Only one of <location>, <url>, <host> or <path> should be specified.
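
The check on line 106 of qdrant_client.py in the traceback counts every argument that is not None, so an empty string still counts as specified. The sketch below replicates only that check in a hypothetical helper (count_specified is not part of qdrant_client) to show how url plus path="" reaches a count of 2.

```python
from typing import Optional


def count_specified(
    location: Optional[str] = None,
    url: Optional[str] = None,
    host: Optional[str] = None,
    path: Optional[str] = None,
) -> int:
    """Replicate the `is not None` count shown in the traceback."""
    return sum(param is not None for param in (location, url, host, path))


# An empty string is not None, so it is counted alongside url.
both = count_specified(url="http://localhost:6333", path="")

# Only None is treated as "not specified".
url_only = count_specified(url="http://localhost:6333", path=None)

print(both, url_only)
```

By this logic, only path=None would avoid the conflict with url. Whether passing path=None explicitly is a viable workaround in the community version is not confirmed here.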

Expected Behavior

The path argument of from_existing_collection should be optional. QdrantClient accepts only one of location, url, host, or path, so a mandatory path makes it impossible to connect with url alone.
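
As a rough sketch of the expected behavior, the hypothetical function below gives path a default of None and forwards only the connection options the caller actually set. The function name, the returned dict, and the local directory used in the example are all assumptions for illustration. This is not the library's implementation or its actual fix.

```python
from typing import Any, Dict, Optional


# Hypothetical signature with path defaulting to None.
def from_existing_collection_sketch(
    embedding: Any,
    path: Optional[str] = None,
    collection_name: Optional[str] = None,
    url: Optional[str] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    if collection_name is None:
        raise ValueError("collection_name must be provided")
    # Keep only the connection options that were explicitly set.
    connection = {"url": url, "path": path}
    specified = {key: value for key, value in connection.items() if value is not None}
    return {"collection_name": collection_name, **specified}


# Remote server: url only, no path needed.
remote = from_existing_collection_sketch(
    embedding=object(),
    collection_name="unique_case_2020",
    url="http://localhost:6333",
)

# Local storage: path only (the directory is a made-up example).
local = from_existing_collection_sketch(
    embedding=object(),
    collection_name="unique_case_2020",
    path="/tmp/qdrant_data",
)
print(remote, local)
```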

Reproduction

  1. Use the provided code to instantiate a Qdrant object from an existing collection.
  2. Observe the TypeError when path is not provided.
  3. Observe the ValueError when path is provided along with url.

Thank you for looking into this issue.

Anush008 commented 6 days ago

Hello @deepslit. Please use the new langchain-qdrant package, which includes a fix for this. The community version is no longer maintained.

pip install langchain-qdrant
from langchain_qdrant import Qdrant
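
A sketch of how the original call might look after switching packages. It assumes that Qdrant.from_existing_collection in langchain-qdrant takes the same keyword arguments as the community version, with path now optional. The thread does not show the new signature, so treat that as an assumption. The wrapper function connect_existing_collection is a hypothetical helper, and it needs langchain-qdrant installed and a Qdrant server running at the given URL.

```python
from typing import Any


def connect_existing_collection(embeddings: Any, collection_name: str, url: str) -> Any:
    """Open an existing collection by URL without passing path."""
    # Imported lazily so this module still loads where langchain-qdrant is absent.
    from langchain_qdrant import Qdrant

    # Assumption: path is optional here, so url alone is enough.
    return Qdrant.from_existing_collection(
        embedding=embeddings,
        collection_name=collection_name,
        url=url,
    )
```

With the values from the original report, the call would be connect_existing_collection(embeddings, "unique_case_2020", "http://localhost:6333").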