Closed vmarchenkoff closed 1 month ago
The issue seems to be related to a timeout error when using OpenAIEmbeddings with the Infinity server. Ensure that the Infinity server is properly configured to handle requests from OpenAIEmbeddings. Specifically, check the following:

1. Server configuration: verify that the Infinity server is running and accessible at the specified `openai_api_base` URL.
2. Timeout settings: increase the timeout in the OpenAIEmbeddings client to allow the server more time to respond, by setting the `timeout` parameter when initializing the client:

```python
from langchain_openai import OpenAIEmbeddings

openai_api_key = "DUMMY"
openai_api_base = "http://<BASE_URL>:8000"

client_emb = OpenAIEmbeddings(api_key=openai_api_key, base_url=openai_api_base, timeout=60)
client_emb.embed_query("Hello everyone")  # originally the Russian test string "Всем привет"
```

3. Server logs: check the server logs for errors or warnings that might explain why the request is timing out; this can show whether the issue lies with the server configuration or the client request.
4. Network issues: ensure no network problems or firewalls are blocking requests from the client to the server.

By addressing these points, you should be able to resolve the timeout error and use OpenAIEmbeddings with the Infinity server.
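Before tuning timeouts, it can help to confirm the endpoint is reachable at all. A minimal sketch using only the standard library, assuming the server exposes the OpenAI-compatible `/models` route (adjust the path if your deployment differs):

```python
import urllib.request


def is_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/models answers with a 2xx within `timeout` seconds."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, ConnectionRefusedError, and timeouts
        return False
```

If this returns False while the container is up, the problem is likely port mapping or firewalling rather than the OpenAIEmbeddings client itself.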
I guess it's because OpenAI is trying to download something that cannot be downloaded (possibly because of the network's settings):

```
~/.local/lib/python3.9/site-packages/tiktoken_ext/openai_public.py:72, in cl100k_base()
     71 def cl100k_base():
---> 72     mergeable_ranks = load_tiktoken_bpe(
     73         "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken",
     74         expected_hash="223921b76ee99bde995b7ff738513eef100fb51d18c93597a113bcffe865b2a7",
     75     )
```
Okay, I've realized that some params must be added:

```python
tiktoken_enabled = False
model = model
```

But how should this be configured properly? The model path from the `docker run` command doesn't help.
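For reference, a sketch of the kwargs that such a setup might use; the model id `intfloat/multilingual-e5-large` is an assumption standing in for whatever the container serves. With `tiktoken_enabled=False`, langchain falls back to a local Hugging Face tokenizer (which requires the `transformers` package and a resolvable HF model id), so no download of the tiktoken BPE file is attempted:

```python
# Hypothetical configuration; adjust base_url and model to your deployment.
embedding_kwargs = {
    "api_key": "DUMMY",                         # Infinity does not check it unless started with an api-key
    "base_url": "http://<BASE_URL>:8000",
    "model": "intfloat/multilingual-e5-large",  # a Hugging Face id, not a path inside the container
    "tiktoken_enabled": False,                  # avoid the cl100k_base download from the traceback
    "timeout": 60,
}

# from langchain_openai import OpenAIEmbeddings
# client_emb = OpenAIEmbeddings(**embedding_kwargs)
```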
Search for related issues here before opening.
langchain pretokenizes the text with the tiktoken tokenizer
As I understand it, there is no way to use OpenAIEmbeddings with an Infinity instance because of the model-specific tokenization; InfinityEmbeddings was implemented to solve this (https://github.com/michaelfeil/infinity/issues/36). But the api-key feature is not implemented yet in langchain's InfinityEmbeddings.
Sorry for the inconvenience; I saw the issue above before opening this one, but read it as a different problem.
Thank you for your answer and for your work in general; this is a beautiful project, very important and useful.
@vmarchenkoff Makes sense!

> But the api-key feature is not implemented yet in langchain's InfinityEmbeddings.

- Correct.
I would recommend using https://github.com/michaelfeil/infinity/tree/main/libs/client_infinity/infinity_client (`pip install infinity_client`).
Thank you!
It looks much more customisable and serious; I guess I should use the default clients for the vector DB and embeddings in my RAG project instead of third-party integrations.
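As an alternative to any client wrapper, the OpenAI-compatible `/embeddings` route can also be called directly with the standard library. A sketch, assuming the standard OpenAI embeddings request/response shape (the `Authorization` header carries the api-key discussed in this thread; Infinity only enforces it when started with one):

```python
import json
import urllib.request


def embed(base_url: str, model: str, texts: list[str],
          api_key: str = "DUMMY", timeout: float = 60.0) -> list[list[float]]:
    """POST texts to {base_url}/embeddings and return one vector per text."""
    body = json.dumps({"model": model, "input": texts}).encode()
    req = urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    # OpenAI-style responses keep vectors under data[i]["embedding"]
    return [item["embedding"] for item in sorted(payload["data"], key=lambda d: d["index"])]
```

This sidesteps the tiktoken pretokenization entirely, since the raw route lets the server do all the tokenization.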
System Info
michaelf34/infinity:latest
Reproduction
```shell
docker run -it --gpus all -v ~/llms/:/app/.cache -p 8000:8000 michaelf34/infinity:latest v2 --model-id ~/llms/multilingual-e5-large --port 8000
```
works just fine.
works fine as well, but:
fails with a timeout error. What did I miss here? The reason I would like to use OpenAIEmbeddings instead of InfinityEmbeddings is the possibility of using an api-key, which is not incorporated into langchain's InfinityEmbeddings.
Thank you in advance and thank you for this beautiful project!
Expected behavior
The same behavior as with the OpenAI client and InfinityEmbeddings.