qdrant / qdrant-client

Python client for Qdrant vector search engine
https://qdrant.tech
Apache License 2.0

Requests Timed out #394

Closed FrancescoSaverioZuppichini closed 9 months ago

FrancescoSaverioZuppichini commented 11 months ago

Hi there, using the client like this

from qdrant_client import QdrantClient

url = "https://<QDRANT_SERVER>"
client = QdrantClient(
    url=url,
    timeout=10,
)

client.<do_stuff>

results in a timeout on every request; if I use my local instance, I don't have any issues.

If I use curl I don't get timeouts, so I was wondering if you have any idea why calling the APIs with the SDK doesn't work but curl does. Maybe some option for https in the client?

Thank you so much.

I cannot share the server URL because it is deployed on Azure in our private network, but it is just qdrant in docker.

Checking inside the container, there is no log for the requests, while with curl I can see that it hits the container.

FrancescoSaverioZuppichini commented 11 months ago

I can confirm it is an issue with the SDK; calling the server directly works. For example,

import requests
import json

url = f"{url}/collections/test_2"

payload = json.dumps({"vectors": {"size": 100, "distance": "Dot"}})
headers = {"Content-Type": "application/json"}

response = requests.request("PUT", url, headers=headers, data=payload)

print(response.text)

works

FrancescoSaverioZuppichini commented 11 months ago

It looks like one has to set https=True even if the URL already has an https scheme - could this be considered a bug?
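For reference, a minimal sketch of the workaround described above (the URL is a placeholder; whether the explicit flag is actually needed is exactly what is in question here):

```python
from qdrant_client import QdrantClient

# Workaround reported above: pass https=True explicitly,
# even though the URL scheme is already https.
client = QdrantClient(
    url="https://my-qdrant.example.com",
    https=True,
    timeout=10,
)
```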

Thiru-GVT commented 11 months ago

Same error here, HTTP works but the client is broken

joein commented 11 months ago

Hi

I'll check it, thanks

In the meantime, @Thiru-GVT, does setting https=True fix the issue for you as it does for @FrancescoSaverioZuppichini?

joein commented 11 months ago

@FrancescoSaverioZuppichini @Thiru-GVT could you please provide the qdrant version and the qdrant-client version you are using?

Btw, in some setups an Azure deployment requires setting the port to 443 instead of 6333
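In case the Azure setup is the culprit, a sketch of pointing the client at port 443 instead of the default 6333 (URL and key are placeholders):

```python
from qdrant_client import QdrantClient

# Some Azure ingress setups terminate TLS on 443 and do not expose 6333,
# so the client must be told to use 443 explicitly.
client = QdrantClient(
    url="https://my-qdrant.azure.example.com",
    port=443,
    api_key="my-api-key",
)
```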

joein commented 11 months ago

I've checked various configurations on qdrant-client==1.7.0

client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key')
client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key', https=True)
client = QdrantClient(url='https://aws.aws.com:6333')
client = QdrantClient(url='https://aws.aws.com:6333', https=True)

client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key', prefer_grpc=True)
client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key', https=True, prefer_grpc=True)
client = QdrantClient(url='https://aws.aws.com:6333', prefer_grpc=True)
client = QdrantClient(url='https://aws.aws.com:6333', https=True, prefer_grpc=True)

All of them seem to work correctly, and https is enabled if the scheme contains 'https', regardless of whether the https flag is set.

019ec6e2 commented 11 months ago

The same problem: get_collections works well, but create_collection times out. Both sync and async clients in Python.

PS. Had to recreate and move the cluster from AWS to GCloud, and now it's working properly

Thiru-GVT commented 11 months ago

I'm using 1.6.9 but upgraded to 1.7.0, and I still experience the same issue.

I'm using http://localhost:6333/, so not https. It's with the upsert operation.

FrancescoSaverioZuppichini commented 11 months ago

@FrancescoSaverioZuppichini @Thiru-GVT could you please provide the qdrant version and the qdrant-client version you are using?

Btw, in some setups an Azure deployment requires setting the port to 443 instead of 6333

Yes, but my assumption was that if I pass a URL with the https protocol, then the SDK should understand that I want to use https 😓

joein commented 11 months ago

@Thiru-GVT could you provide a python code sample to reproduce or just show us the code you are using?

joein commented 11 months ago

@FrancescoSaverioZuppichini sorry, I am not sure we are on the same page right now

It looks like one has to set https=True even if the URL already has an https scheme - could this be considered a bug?

I was not able to reproduce this one.

Could you provide an example which does not work with <url containing https scheme>, but works with <url containing https scheme> + https=True

If you have some sensitive url, could you please replace sensitive parts with some gibberish so I could test the format of the url?

like if you have https://fc055abe-cbcf-4b57-a178-224713c9255d.europe-west3-0.gcp.cloud.qdrant.io send <https://< uuid >.< words with digits and hyphens >.gcp.cloud.qdrant.io:6333> or something like this ?

HaoES commented 11 months ago

@joein I am experiencing the same issue, here is my python code:

client = QdrantClient(
    url="https://c445f1e5-8686-4081-8cab-6ef96ceac615.europe-west3-0.gcp.cloud.qdrant.io:6333",
    api_key=os.environ["QDRANT_API"],
    https=True,
)

embeddings = HuggingFaceEmbeddings(model_name="NeuML/pubmedbert-base-embeddings")

vectorstore = Qdrant(
    client=client,
    collection_name="med_assistant",
    embeddings=embeddings,
)

and then later when I do vectorstore.add_texts(chunks) it gives me ResponseHandlingException: The read operation timed out

joein commented 11 months ago

Hi @HaoES

It seems like you are using some third-party library, not the pure Python client, right?

Is it langchain or maybe something else?

Could you please provide a runnable code sample (with all imports included) and maybe some of the data you are using

Collection info might be useful as well

HaoES commented 11 months ago

Hey @joein, I am using langchain and streamlit. When I use only a short pdf it works without any problem, but when I use multiple files (around 40 pages each) it times out. Here is my whole app.py code:

import streamlit as st
from pypdf import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Qdrant
import qdrant_client
from qdrant_client import QdrantClient
import os

# create a qdrant_client

client = QdrantClient(
    url="https://c445f1e5-8686-4081-8cab-6ef96ceac615.europe-west3-0.gcp.cloud.qdrant.io:6333",
    api_key=os.environ["QDRANT_API"],
    https=True,
)

embeddings = HuggingFaceEmbeddings(model_name="NeuML/pubmedbert-base-embeddings")

vectorstore = Qdrant(
    client=client,
    collection_name="med_assistant",
    embeddings=embeddings,
)

def get_pdf_text(docs):
    text = ""
    for doc in docs:
        pdf_reader = PdfReader(doc)
        for page in pdf_reader.pages:
            text += page.extract_text()
    return text

def get_chunks(text):
    text_splitter = CharacterTextSplitter(
        separator="\n", chunk_size=1000, chunk_overlap=200, length_function=len
    )
    chunks = text_splitter.split_text(text)
    return chunks

def main():
    st.set_page_config(page_title="Med Prep", page_icon=":medical_symbol:")
    st.header("Med Prep :medical_symbol:")
    st.text_input("Ask questions about your documents:")

    with st.sidebar:
        st.subheader("Your documents")
        docs = st.file_uploader(
            "Upload your course materials here then click 'Process':",
            accept_multiple_files=True,
            type="pdf",
        )
        if st.button("Process"):
            with st.spinner("Processing files ..."):
                # get pdf files:
                raw_text = get_pdf_text(docs)
                # get chunks of texts
                chunks = get_chunks(raw_text)
                # get vectorstore
                vectorstore.add_texts(chunks)

if __name__ == "__main__":
    main()

What do you mean by collection info?

Wamy-Dev commented 11 months ago

Having a similar problem, embeddings seem to give this error:

vector_size = len(partial_embeddings[0]), IndexError: list index out of range.

It seems to be related to my use of HTTP; it only fails on large requests to qdrant, for many or large files.

FrancescoSaverioZuppichini commented 11 months ago

@FrancescoSaverioZuppichini sorry, I am not sure we are on the same page right now

It looks like one has to set https=True even if the URL already has an https scheme - could this be considered a bug?

I was not able to reproduce this one.

Could you provide an example which does not work with <url containing https scheme>, but works with <url containing https scheme> + https=True

If you have some sensitive url, could you please replace sensitive parts with some gibberish so I could test the format of the url?

like if you have https://fc055abe-cbcf-4b57-a178-224713c9255d.europe-west3-0.gcp.cloud.qdrant.io send <https://< uuid >.< words with digits and hyphens >.gcp.cloud.qdrant.io:6333> or something like this ?

I am sorry, but I can't provide a URL. So if you pass https://blablabla and you don't set https=True, it works for you? Thank you so much

romqn1999 commented 10 months ago

I'm using 1.6.9 but upgraded to 1.7.0, and I still experience the same issue.

I'm using http://localhost:6333/, so not https. It's with the upsert operation.

I'm uncertain whether my issue aligns with yours, but I encountered a situation where upserting a substantial volume of data exceeded the specified timeout of 5 seconds. To address this, I opted for a solution utilizing gRPC, with the prefer_grpc=True setting.
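A sketch of the gRPC workaround described above, combined with a larger timeout for long-running upserts (URL and values are illustrative):

```python
from qdrant_client import QdrantClient

# prefer_grpc=True routes requests over gRPC (port 6334 by default),
# which avoided the REST upsert timeouts in this report; a larger
# timeout also gives bulk upserts more headroom.
client = QdrantClient(
    url="https://my-qdrant.example.com",
    prefer_grpc=True,
    timeout=60,  # seconds
)
```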

joein commented 10 months ago

hi @Wamy-Dev, you have an index error, which means that partial_embeddings is empty; not sure how it relates to HTTP

joein commented 10 months ago

hi @HaoES , @romqn1999

you might need to check this page

it might be related to network issues / non-optimal configuration of the collection / etc

you can also try to decrease batch size

sharing collection info can also be helpful (by this I mean e.g. output of QdrantClient.get_collection method)
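One way to decrease the batch size is to upsert in smaller slices yourself. A minimal sketch (the `batched` helper and the batch size of 64 are illustrative, not part of the client API):

```python
def batched(items, batch_size):
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Hypothetical usage with a qdrant-client instance (names assumed):
# for batch in batched(points, 64):
#     client.upsert(collection_name="med_assistant", points=batch)

# 250 items in slices of 64 -> 4 slices (64 + 64 + 64 + 58)
print(len(list(batched(list(range(250)), 64))))
```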

joein commented 10 months ago

hi @FrancescoSaverioZuppichini

sorry for the late response

I've checked various configurations on qdrant-client==1.7.0

client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key')
client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key', https=True)
client = QdrantClient(url='https://aws.aws.com:6333')
client = QdrantClient(url='https://aws.aws.com:6333', https=True)

client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key', prefer_grpc=True)
client = QdrantClient(url='https://aws.aws.com:6333', api_key='api_key', https=True, prefer_grpc=True)
client = QdrantClient(url='https://aws.aws.com:6333', prefer_grpc=True)
client = QdrantClient(url='https://aws.aws.com:6333', https=True, prefer_grpc=True)

All of them seem to work correctly, and https is enabled if the scheme contains 'https', regardless of whether the https flag is set.

As I wrote here, I had tried these configurations and they were correct: all of them set the QdrantClient._client._https flag correctly and constructed QdrantClient._client.rest_uri as expected.

If you can't share even an obfuscated URI, then it might be helpful if you could check the values of the mentioned variables yourself and post the results here
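For anyone wanting to run the same check, a sketch of inspecting those attributes (they are private and may change between qdrant-client versions; the URL is a placeholder):

```python
from qdrant_client import QdrantClient

client = QdrantClient(url="https://my-qdrant.example.com:6333")

# The private attributes mentioned above: whether TLS is enabled,
# and the URI the REST transport will actually hit.
print(client._client._https)    # expect True for an https:// scheme
print(client._client.rest_uri)
```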

FrancescoSaverioZuppichini commented 9 months ago

sorry for the late reply as well, noticed the problem once again in production - trying to pin the version and maybe increase the timeout

joein commented 9 months ago

@FrancescoSaverioZuppichini it seems to me that we could go back and forth like this indefinitely; I suggest you contact me on Discord so we can solve the issue in real time, and then publish a post-mortem here.

To find me on Discord, go to our Discord channel and look for george.panchuk.

FrancescoSaverioZuppichini commented 9 months ago

@FrancescoSaverioZuppichini it seems to me that we could go back and forth like this indefinitely; I suggest you contact me on Discord so we can solve the issue in real time, and then publish a post-mortem here.

To find me on Discord, go to our Discord channel and look for george.panchuk.

thank you so much for the support, apparently colleagues changed the qdrant port without telling me - no s*hit I was getting timeouts haha

I'll close it for now since it seems to be fixed

m-navarro93 commented 8 months ago

Hi!

I have an additional solution to try. Looking through the SDK code, if you don't set port=None when instantiating a QdrantClient object, the string :6333 gets added onto the request URL.

This was causing timeout issues for us with our hosted database. So...

https://my-url.com:6333/collections -> timeout error
https://my-url.com/collections -> success!

Using the test code below allowed me to connect to our hosted database.

qdrant_api_key = "my-api-key"
qdrant_url = "https://my-url.com"
dev_client = QdrantClient(url=qdrant_url, port=None, api_key=qdrant_api_key)
dev_client.get_collections()

Hopefully this helps!

TalentumSebasG commented 6 months ago

Hi!

I have an additional solution to try. Looking through the SDK code, if you don't set port=None when instantiating a QdrantClient object, the string :6333 gets added onto the request URL.

This was causing timeout issues for us with our hosted database. So...

https://my-url.com:6333/collections -> timeout error
https://my-url.com/collections -> success!

Using the test code below allowed me to connect to our hosted database.

qdrant_api_key = "my-api-key"
qdrant_url = "https://my-url.com"
dev_client = QdrantClient(url=qdrant_url, port=None, api_key=qdrant_api_key)
dev_client.get_collections()

Hopefully this helps!

This is the solution, thank you: port=None

iremnuy commented 6 months ago

Hello !

This issue persists for me, but I don't get a timeout when I call get_collection; the problem occurs during the search request.

Here is the result of get_collection:

status=<CollectionStatus.GREEN: 'green'> optimizer_status=<OptimizersStatusOneOf.OK: 'ok'> vectors_count=1775036 indexed_vectors_count=1775036 points_count=1775036 segments_count=19 config=CollectionConfig(params=CollectionParams(vectors=VectorParams(size=1024, distance=<Distance.COSINE: 'Cosine'>, hnsw_config=None, quantization_config=None, on_disk=True), shard_number=2, sharding_method=None, replication_factor=1, write_consistency_factor=1, read_fan_out_factor=None, on_disk_payload=True, sparse_vectors=None), hnsw_config=HnswConfig(m=16, ef_construct=100, full_scan_threshold=10000, max_indexing_threads=0, on_disk=False, payload_m=None), optimizer_config=OptimizersConfig(deleted_threshold=0.2, vacuum_min_vector_number=1000, default_segment_number=0, max_segment_size=None, memmap_threshold=None, indexing_threshold=20000, flush_interval_sec=5, max_optimization_threads=None), wal_config=WalConfig(wal_capacity_mb=32, wal_segments_ahead=0), quantization_config=None) payload_schema={}

As can be seen, the collection consists of 1.8M vectors, but I can do the similarity search within 5 secs in the qdrant interface. Also, my call sometimes (like 1 in 20) does not time out and I get the scored points back. And a friend who uses exactly the same config and network does not get a timeout at all most of the time.

Here is the call where I get stuck:

hits = self.client.search(
    collection_name=self.collection_name,
    score_threshold=0.6,
    query_vector=query_embedding.tolist()[0],
    with_payload=True,
    limit=10,
    timeout=20000,
)

Do you have any ideas?

joein commented 6 months ago

Hi @iremnuy

You're using the on_disk option with your vectors, which requires fast disks. What are your disk IOPS? Also, to speed up the search, you might want to look into quantization options such as scalar or binary quantization
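Scalar quantization can be enabled on an existing collection without recreating it; a sketch using `update_collection` (URL and collection name are placeholders):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="https://my-qdrant.example.com:6333")

# int8 scalar quantization, with the quantized vectors kept in RAM so
# that search rarely has to touch the full-precision vectors on disk.
client.update_collection(
    collection_name="my_collection",
    quantization_config=models.ScalarQuantization(
        scalar=models.ScalarQuantizationConfig(
            type=models.ScalarType.INT8,
            always_ram=True,
        )
    ),
)
```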

iremnuy commented 6 months ago

The data resides on a 50 GB SSD; the iostat output is as follows:

Device  r/s   rkB/s  rrqm/s  %rrqm  r_await  rareq-sz  w/s   wkB/s   wrqm/s  %wrqm  w_await  wareq-sz  d/s   dkB/s  drqm/s  %drqm  d_await  dareq-sz  f/s   f_await  aqu-sz  %util
xvda    9.94  86.00  0.24    2.33   1.77     8.65      5.91  215.81  5.74    49.26   2.62    36.52     0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.03    1.67

Therefore total IOPS is around 16.
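The ~16 figure comes from adding the read and write rates reported by iostat (a quick check, using the xvda values above):

```python
# iostat reports reads/s (r/s) and writes/s (w/s) separately;
# total IOPS is roughly their sum.
reads_per_sec = 9.94   # r/s for xvda
writes_per_sec = 5.91  # w/s for xvda
print(round(reads_per_sec + writes_per_sec))  # -> 16
```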

I will check for quantization options, and let you know if it helps, thank you !

joein commented 6 months ago

Are you sure these are the correct stats? That is extremely slow; even HDDs are faster

iremnuy commented 6 months ago

I checked my AWS volumes; I have 4 of them, with 100/100/150/180 IOPS respectively. They still seem to be slow. Do you have any suggestion on the IOPS threshold that would satisfy my needs (searching through 1.8M vectors in 2-3 secs, in my case)? I also checked the reads/sec graph; it occasionally peaks at 16K IOPS.

iremnuy commented 6 months ago

I solved the problem with scalar quantization. I am able to do similarity search among 1.8M vectors in 0.85 secs on average with an SSD that supports up to 3K IOPS. Thank you.