langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
89.3k stars 14.08k forks

Azure OpenAI Embedding langchain.embeddings.openai.embed_with_retry won't provide any embeddings after retries. #2493

Closed masoumi76 closed 3 months ago

masoumi76 commented 1 year ago

I have the following code:

docsearch = Chroma.from_documents(texts, embeddings, persist_directory=persist_directory)

and get the following error:

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Requests to the Embeddings_Create Operation under Azure OpenAI API version 2022-12-01 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 3 seconds. Please contact Azure support service if you would like to further increase the default rate limit.

The length of my texts list is less than 100, and as far as I know Azure has a 400 requests/min limit, so I should not be hitting any rate limit at all. Can someone explain what is happening that results in this error?

After these retries by LangChain, it looks like the embeddings are lost and never stored in the Chroma DB. Could someone please give me a hint about what I'm doing wrong?

using langchain==0.0.125

Many thanks
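
One workaround often suggested for this class of rate-limit error is to throttle the embedding calls yourself: send the texts in small batches and sleep between batches. A minimal sketch, assuming a generic embed callable — `embed_in_batches` and `fake_embed` are hypothetical helpers for illustration, not part of langchain:

```python
import time

def embed_in_batches(texts, embed_fn, batch_size=16, delay=1.0):
    """Embed texts in small batches, pausing between calls to stay under the rate limit."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors.extend(embed_fn(batch))    # one API call per batch
        if start + batch_size < len(texts):
            time.sleep(delay)              # throttle before the next call
    return vectors

# Demo with a stand-in embed function; real use would wrap embeddings.embed_documents.
fake_embed = lambda batch: [[0.0, 0.0, 0.0] for _ in batch]
vecs = embed_in_batches(["doc1", "doc2", "doc3"], fake_embed, batch_size=2, delay=0.0)
```

In real use the batch size and delay would be tuned to the requests-per-minute limit of your Azure pricing tier.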

xsser commented 1 year ago

+1

Thystler commented 1 year ago

+1

masoumi76 commented 1 year ago

Any suggestion would be greatly appreciated!

Peter-Devine commented 1 year ago

+1

nepomny commented 1 year ago

I set max_retries = 10. I still get "Retrying langchain.embeddings.openai.embed_with_retry" messages, but I was able to complete the index creation.

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", chunk_size=1, max_retries=10)

aiswaryasankar commented 1 year ago

+10000

nitsahoo-hs commented 1 year ago

Any solution to fix this issue? +1

zxs731 commented 1 year ago

As far as I know, the Azure OpenAI embedding API is different from OpenAI's official embedding API: it doesn't work out of the box with Chroma.from_documents, so instead we need to call the Azure OpenAI embedding API directly.

Levilian commented 1 year ago

+1111

EricLee911110 commented 1 year ago

I tried something like this:

embeddings = OpenAIEmbeddings()
vector_store = FAISS.from_texts(texts=["example1", "example2"], embedding=embeddings)

and

vector_store = Chroma.from_texts(texts=["example1", "example2"], embedding=embeddings)

got: Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..

I'm passing a list with a length of 2, and it still gives me a RateLimitError.

I tried two versions of LangChain, 0.0.162 and 0.0.188, and both produced the same error.

tim-g-provectusalgae commented 1 year ago

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 25115737c4fe3e6d4deef4961066ba2e in your email.) {
  "error": {
    "message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 25115737c4fe3e6d4deef4961066ba2e in your email.)",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 25115737c4fe3e6d4deef4961066ba2e in your email.)', 'type': 'server_error', 'param': None, 'code': None}} {'Date': 'Fri, 16 Jun 2023 01:43:24 GMT', 'Content-Type': 'application/json', 'Content-Length': '366', 'Connection': 'keep-alive', 'access-control-allow-origin': '*', 'openai-organization': 'provectus-algae-pem6gx', 'openai-processing-ms': '5602', 'openai-version': '2020-10-01', 'strict-transport-security': 'max-age=15724800; includeSubDomains', 'x-ratelimit-limit-requests': '3000', 'x-ratelimit-remaining-requests': '2999', 'x-ratelimit-reset-requests': '20ms', 'x-request-id': '25115737c4fe3e6d4deef4961066ba2e', 'CF-Cache-Status': 'DYNAMIC', 'Server': 'cloudflare', 'CF-RAY': '7d7f5c6ebacfa83e-SYD', 'alt-svc': 'h3=":443"; ma=86400'}.

Killing me, I've sent through a single request (on a paid plan) and am being rate limited on embeddings.

AaronWard commented 1 year ago

After a bit of digging, I suspect two possible causes:

  1. If you were using free credits that ran out and you moved to a pay-as-you-go plan with OpenAI, you may need to create a new API key.
  2. You're hitting the requests-per-minute rate limit. I found a notebook from OpenAI explaining ways to work around it: How_to_handle_rate_limits.ipynb. I haven't tested it yet but will report back if I make any headway.

I will try to implement a fix that limits the rate of requests made per minute (assuming the langchain community doesn't already have one somewhere).
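
The OpenAI cookbook notebook mentioned above essentially boils down to retrying with exponential backoff plus jitter. A minimal stdlib-only sketch of that idea — `with_backoff` and `flaky` are illustrative names, and this assumes nothing about langchain's internals:

```python
import random
import time

def with_backoff(fn, max_retries=6, base=1.0, cap=60.0):
    """Call fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: propagate the last error
            # Delay doubles each attempt, capped, with jitter to avoid synchronized retries.
            delay = min(cap, base * 2 ** attempt) * (0.5 + random.random() / 2)
            time.sleep(delay)

# Demo: a function that fails twice (simulating a rate-limit error), then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated RateLimitError")
    return "ok"

result = with_backoff(flaky, base=0.01)
```

Note that langchain's embed_with_retry already does something similar internally via tenacity, so raising max_retries on OpenAIEmbeddings is often enough.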

meanirban100 commented 1 year ago

+1

aiakubovich commented 1 year ago

Getting this for FAISS.from_documents(data, embeddings):

Traceback (most recent call last):
  File "/app/scheduler/4_generate_embeddings.py", line 52, in <module>
    vectors = FAISS.from_documents(data, embeddings)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/base.py", line 332, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 517, in from_texts
    embeddings = embedding.embed_documents(texts)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 452, in embed_documents
    return self._get_len_safe_embeddings(texts, engine=self.deployment)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 302, in _get_len_safe_embeddings
    response = embed_with_retry(
               ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 97, in embed_with_retry
    return _embed_with_retry(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 95, in _embed_with_retry
    return embeddings.client.create(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 230, in request
    resp, got_stream = self._interpret_response(result, stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 624, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 687, in _interpret_response_line
    raise self.handle_error_response(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 337, in handle_error_response
    raise error.APIError(
openai.error.APIError: Invalid response object from API: '{ "statusCode": 500, "message": "Internal server error", "activityId": "......." }' (HTTP response code was 500)

marielaquino commented 1 year ago

Getting the same error using Azure OpenAI with openai.api_version = "2023-05-15"

Creating my embeddings:

from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(
    chunk_size=1,
    openai_api_version=openai.api_version,
    openai_api_key=openai.api_key,
    openai_api_type=openai.api_type,
    openai_api_base=openai.api_base,
    deployment="ChatGPTEmbeddings",
    model="text-embedding-ada-002",
)

Creating vector store index:

index = VectorstoreIndexCreator(
    embedding=embeddings,
    vectorstore_cls=DocArrayInMemorySearch,
).from_loaders([loader])

Receiving this error in a loop; the cell ran for 1 minute and 51 seconds:

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: Requests to the Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. Operation under Azure OpenAI API version 2023-05-15 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 1 second. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

JosefButts commented 1 year ago

+1000

theNullP0inter commented 11 months ago

+1

niznet89 commented 11 months ago

+1

OpenAI's developer experience is pretty frustrating.

zxs731 commented 11 months ago

There are two possible solutions:

  1. You can request a quota increase; MS support has confirmed this is possible.
  2. Starting in July, Azure OpenAI supports embeddings with a chunk size of 16. You can find detailed usage information in this reference: https://m.bilibili.com/video/BV1oP411r7g6 (unfortunately, the relevant section at 1:30 is in Chinese only).

HowdyHuang commented 11 months ago

I also hit this problem, but when I retried later the error was gone. SOS

spierp-hd commented 11 months ago

I'm having this issue today, but not yesterday.

2023-08-08 14:56:18 INFO error_code=429 error_message='Requests to the Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. Operation under Azure OpenAI API version 2023-05-15 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 2 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.' error_param=None error_type=None message='OpenAI API error received' stream_error=False
2023-08-08 14:56:18 WARNING Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: Requests to the Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. Operation under Azure OpenAI API version 2023-05-15 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 2 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

parth-patel2023 commented 11 months ago

I also hit this problem, but when I retried later the error was gone. SOS

Can you please share the source code you used?

HowdyHuang commented 11 months ago

I also hit this problem, but when I retried later the error was gone. SOS

Can you please share the source code you used?

My error is slightly different from the title. I think it was a network problem surfacing through the langchain library; I cannot reproduce it with the same code.

The error looked like this: Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised APIError: Invalid response object from API: '{ "statusCode": 500, "message": "Internal server error", "activityId": "xxxx" }' (HTTP response code was 500).

This bug hasn't happened again and doesn't affect me. Thank you~

aiakubovich commented 11 months ago

The OpenAI API rate limit is a big problem. But OpenAI embeddings are not the best anyway, so it can make sense to just use a free one (see the MTEB leaderboard: https://huggingface.co/spaces/mteb/leaderboard).

meakshayraut commented 10 months ago

Define the following values in the code 👍:

openai.api_type = "azure"
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_KEY"] = "your api key"
os.environ["OPENAI_API_BASE"] = "put yours"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"

llm = AzureOpenAI(
    api_key="your api key",
    api_base="put yours",
    api_version="2023-03-15-preview",
    deployment_name="name of the deployment",
)

llm_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", chunk_size=1)

This will definitely work with Chroma and FAISS DBs.

elorberb commented 8 months ago

Someone solved the issue?

meakshayraut commented 8 months ago

@elorberb Define the following values in the code 👍:

openai.api_type = "azure"
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_KEY"] = "your api key"
os.environ["OPENAI_API_BASE"] = "put yours"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"

llm = AzureOpenAI(
    api_key="your api key",
    api_base="put yours",
    api_version="2023-03-15-preview",
    deployment_name="name of the deployment",
)

llm_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", chunk_size=1)

zxs731 commented 8 months ago

To improve embedding performance, you can set the chunk_size to 16, but you first need to update the API version to "2023-07-01-preview":

os.environ["OPENAI_API_VERSION"] = "2023-07-01-preview"

And don't forget the deployment name:

embeddings = OpenAIEmbeddings(
        deployment="your deployment name",
        model="text-embedding-ada-002",
        chunk_size=16
)

meakshayraut commented 8 months ago

What does 16 actually do here?

zxs731 commented 8 months ago

It's the batch size: the number of texts sent to the embedding API in each request, so fewer requests are needed overall.
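
To make the effect concrete: as I understand it, langchain's OpenAIEmbeddings sends texts in batches of chunk_size, so the request count is the ceiling of len(texts) / chunk_size. A quick arithmetic sketch, with no API calls (`batch_counts` is just an illustrative helper):

```python
def batch_counts(n_texts, chunk_size):
    """How many embedding requests are needed for n_texts at a given chunk_size."""
    return -(-n_texts // chunk_size)  # ceiling division

# With 100 texts: chunk_size=1 means 100 requests, while chunk_size=16 means only 7,
# which is why raising chunk_size helps with per-minute request limits.
requests_small = batch_counts(100, 1)
requests_large = batch_counts(100, 16)
```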

meakshayraut commented 8 months ago

Ok Thanks

Elikyals commented 7 months ago

Add a time.sleep(7) between requests.

ghost commented 6 months ago

I got the same problem. Has anyone solved this issue?