Open shenhai-ran opened 4 months ago
The QueryFusionRetriever expects an instance of httpx.AsyncClient when using use_async=True because asynchronous operations require an asynchronous HTTP client to handle non-blocking requests. This preserves the asynchronous nature of the operations, allowing concurrent execution without blocking the event loop.
To resolve the TypeError when using a custom http_client with a self-signed certificate, you need to ensure that the custom http_client is an instance of httpx.AsyncClient if use_async=True. Additionally, you should configure the httpx.AsyncClient to trust the self-signed certificate.
Here is an example of how you can create a custom httpx.AsyncClient with a self-signed certificate:
import httpx
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Create a custom AsyncClient that trusts the self-signed certificate
custom_async_http_client = httpx.AsyncClient(verify="<my/certificate>")

# Pass the custom AsyncClient to OpenAI and OpenAIEmbedding
llm = OpenAI(
    async_http_client=custom_async_http_client,
)
embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    dimensions=512,
    http_client=custom_async_http_client,
)
# Use the rest of your code as is
documents_1 = SimpleDirectoryReader(
    input_files=["<path/to/file1>"]
).load_data()
documents_2 = SimpleDirectoryReader(
    input_files=["<path/to/file2>"]
).load_data()

index_1 = VectorStoreIndex.from_documents(
    documents_1, embed_model=embed_model, show_progress=True
)
index_2 = VectorStoreIndex.from_documents(
    documents_2, embed_model=embed_model, show_progress=True
)

retriever = QueryFusionRetriever(
    [index_1.as_retriever(), index_2.as_retriever()],
    similarity_top_k=2,
    num_queries=4,  # set this to 1 to disable query generation
    use_async=True,
    verbose=True,
    llm=llm,
    # query_gen_prompt="...",  # we could override the query generation prompt here
)

nodes_with_scores = await retriever.retrieve("How do I setup a chroma vector store?")
for node in nodes_with_scores:
    print(f"Score: {node.score:.2f} - {node.text[:100]}...")
In this example, verify="<my/certificate>" specifies the path to your self-signed certificate so the client trusts it; this does not disable SSL verification. If you instead pass verify=False, use that only in a secure and controlled environment, as disabling SSL verification can expose you to security risks [1].
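As an aside, httpx's verify parameter also accepts a pre-built ssl.SSLContext, which makes the trust configuration explicit. A minimal standard-library sketch (the certificate path is a placeholder, not from this thread):

```python
import ssl

# Default client-side context: hostname checking and peer
# verification remain enabled.
ctx = ssl.create_default_context()

# Hypothetical path: additionally trust your self-signed CA.
# ctx.load_verify_locations(cafile="my/certificate.pem")

# httpx would then accept the context directly:
# httpx.AsyncClient(verify=ctx)
print(ctx.check_hostname, ctx.verify_mode == ssl.CERT_REQUIRED)
```

This keeps normal certificate verification on while adding your own CA, rather than turning verification off wholesale.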
You need to pass the sync and async custom clients to both the llm and the embeddings.
You'll see in your traceback that it's pointing towards embeddings. (Although looking at it now, the OpenAIEmbedding class doesn't take an async http client yet, so that would have to be added.)
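To make the failure mode concrete, here is a hypothetical wrapper (not llama-index internals, just an illustration of the pattern): an object that exposes both a sync and an async path needs a client for each, and supplying only the sync one breaks the async path in exactly the way the reported TypeError suggests.

```python
import asyncio

class FakeRetriever:
    """Hypothetical sketch: a sync client backs retrieve(),
    an async client backs aretrieve()."""

    def __init__(self, http_client=None, async_http_client=None):
        self._sync = http_client
        self._async = async_http_client

    def retrieve(self):
        if self._sync is None:
            raise TypeError("http_client expects a sync Client")
        return "sync ok"

    async def aretrieve(self):
        if self._async is None:
            raise TypeError("http_client expects an instance of httpx.AsyncClient")
        return "async ok"

r = FakeRetriever(http_client=object())  # async client forgotten
print(r.retrieve())                      # sync path works: sync ok
try:
    asyncio.run(r.aretrieve())           # the use_async=True path
except TypeError as e:
    print(e)                             # fails like the reported error
```

This is why passing both http_client and async_http_client matters: setting only one leaves the other execution path without a client.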
@logan-markewich Thanks for your feedback. I have tried to include the sync and async custom clients in both the llm and the embeddings as you suggested, but I still get the same error...
If you don't use a self-signed certificate, can you try verify=False in the client? Do you get results without similar errors?
Thanks
Bug Description
I don't know if this is a bug or if I'm using it wrong. I have a local self-signed certificate, so I have to modify the http_client in llm. Later, if I use QueryFusionRetriever with use_async=True, I get an error (message below). I don't understand why http_client expects an instance of httpx.AsyncClient.
I use the example at Simple Fusion Retriever, and my code example is below.
I tried to play with http_client and async_http_client in the OpenAI object, but neither of them works. Only setting use_async=False works.
Version
0.10.37
Steps to Reproduce
Error message as follows:
Relevant Logs/Tracebacks
No response