Python Client library for the Qdrant vector search engine.
Client library and SDK for the Qdrant vector search engine.
Library contains type definitions for all Qdrant API and allows to make both Sync and Async requests.
Client allows calls for all Qdrant API methods directly. It also provides some additional helper methods for frequently required operations, e.g. initial collection uploading.
See QuickStart for more details!
Python Client API Documentation is available at python-client.qdrant.tech
pip install qdrant-client
Python client allows you to run same code in local mode without running Qdrant server.
Simply initialize client like this:
from qdrant_client import QdrantClient
client = QdrantClient(":memory:")
# or
client = QdrantClient(path="path/to/db") # Persists changes to disk
Local mode is useful for development, prototyping and testing.
pip install qdrant-client[fastembed]
FastEmbed is a library for creating fast vector embeddings on CPU. It is based on ONNX Runtime and allows to run inference on CPU with GPU-like performance.
Qdrant Client can use FastEmbed to create embeddings and upload them to Qdrant. This allows to simplify API and make it more intuitive.
from qdrant_client import QdrantClient
# Initialize the client
client = QdrantClient(":memory:") # or QdrantClient(path="path/to/db")
# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [
{"source": "Langchain-docs"},
{"source": "Linkedin-docs"},
]
ids = [42, 2]
# Use the new add method
client.add(
collection_name="demo_collection",
documents=docs,
metadata=metadata,
ids=ids
)
search_result = client.query(
collection_name="demo_collection",
query_text="This is a query document"
)
print(search_result)
FastEmbed can also utilise GPU for faster embeddings. To enable GPU support, install
pip install 'qdrant-client[fastembed-gpu]'
from qdrant_client import QdrantClient
# Initialize the client
client = QdrantClient(":memory:") # or QdrantClient(path="path/to/db")
client.set_model(client.DEFAULT_EMBEDDING_MODEL, providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
Note:
fastembed-gpu
andfastembed
are mutually exclusive. You can only install one of them.If you previously installed
fastembed
, you might need to start from a fresh environment to installfastembed-gpu
.
To connect to Qdrant server, simply specify host and port:
from qdrant_client import QdrantClient
client = QdrantClient(host="localhost", port=6333)
# or
client = QdrantClient(url="http://localhost:6333")
You can run Qdrant server locally with docker:
docker run -p 6333:6333 qdrant/qdrant:latest
See more launch options in Qdrant repository.
You can register and use Qdrant Cloud to get a free tier account with 1GB RAM.
Once you have your cluster and API key, you can connect to it like this:
from qdrant_client import QdrantClient
qdrant_client = QdrantClient(
url="https://xxxxxx-xxxxx-xxxxx-xxxx-xxxxxxxxx.us-east.aws.cloud.qdrant.io:6333",
api_key="<your-api-key>",
)
Create a new collection
from qdrant_client.models import Distance, VectorParams
client.create_collection(
collection_name="my_collection",
vectors_config=VectorParams(size=100, distance=Distance.COSINE),
)
Insert vectors into a collection
import numpy as np
from qdrant_client.models import PointStruct
vectors = np.random.rand(100, 100)
# NOTE: consider splitting the data into chunks to avoid hitting the server's payload size limit
# or use `upload_collection` or `upload_points` methods which handle this for you
# WARNING: uploading points one-by-one is not recommended due to requests overhead
client.upsert(
collection_name="my_collection",
points=[
PointStruct(
id=idx,
vector=vector.tolist(),
payload={"color": "red", "rand_number": idx % 10}
)
for idx, vector in enumerate(vectors)
]
)
Search for similar vectors
query_vector = np.random.rand(100)
hits = client.search(
collection_name="my_collection",
query_vector=query_vector,
limit=5 # Return 5 closest points
)
Search for similar vectors with filtering condition
from qdrant_client.models import Filter, FieldCondition, Range
hits = client.search(
collection_name="my_collection",
query_vector=query_vector,
query_filter=Filter(
must=[ # These conditions are required for search results
FieldCondition(
key='rand_number', # Condition based on values of `rand_number` field.
range=Range(
gte=3 # Select only those results where `rand_number` >= 3
)
)
]
),
limit=5 # Return 5 closest points
)
See more examples in our Documentation!
To enable (typically, much faster) collection uploading with gRPC, use the following initialization:
from qdrant_client import QdrantClient
client = QdrantClient(host="localhost", grpc_port=6334, prefer_grpc=True)
Starting from version 1.6.1, all python client methods are available in async version.
To use it, just import AsyncQdrantClient
instead of QdrantClient
:
from qdrant_client import AsyncQdrantClient, models
import numpy as np
import asyncio
async def main():
# Your async code using QdrantClient might be put here
client = AsyncQdrantClient(url="http://localhost:6333")
await client.create_collection(
collection_name="my_collection",
vectors_config=models.VectorParams(size=10, distance=models.Distance.COSINE),
)
await client.upsert(
collection_name="my_collection",
points=[
models.PointStruct(
id=i,
vector=np.random.rand(10).tolist(),
)
for i in range(100)
],
)
res = await client.search(
collection_name="my_collection",
query_vector=np.random.rand(10).tolist(), # type: ignore
limit=10,
)
print(res)
asyncio.run(main())
Both, gRPC and REST API are supported in async mode. More examples can be found here.
This project uses git hooks to run code formatters.
Install pre-commit
with pip3 install pre-commit
and set up hooks with pre-commit install
.
pre-commit requires python>=3.8