I am experiencing the same issue, also using FastAPI.
The first time it tries, the connection is dropped with a 104 error:
('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
After that, I get the error:
Error transcribing call 128316136 (attempt 2): Invalid file formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']
Even though, from the docs:
File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
Also, the recording is only 5.6 MB:
-rw-r--r-- 1 zwhitchcox zwhitchcox 5.6M Apr 6 14:45 data/recordings/128316136.mp3
It seems the problem is on OpenAI's servers, because the "Invalid file formats" error does not appear to be in this repo.
I start the server with uvicorn (uvicorn app:app --reload --port 5000), which kills and restarts the server whenever I make a change. Sometimes, when I kill the server non-gracefully, orphan processes may be left over, because the port (5000) is still in use, and so I kill that process.
I'm thinking maybe those leftover open sockets are still communicating with the OpenAI servers, and maybe OpenAI's servers are blocking requests from my IP address. I don't really know what is happening, but that could be the source of the issue, because during development I sometimes force-kill the server in the middle of a transcription to avoid paying for Whisper API calls that I'm not using.
Not sure if that is helpful, just a little more triage info.
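One more guess, in case it helps anyone: if a retry reuses the same open file handle, the handle may already be at EOF after the failed first attempt, which could plausibly surface as a format error on attempt 2. A minimal sketch that re-opens the file on every attempt (transcribe_with_retry is a hypothetical helper, not part of the openai module):

import openai

def transcribe_with_retry(path, attempts=2):
    # Re-open the file per attempt so a dropped connection never leaves
    # a half-consumed handle behind for the retry.
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "rb") as f:
                return openai.Audio.transcribe("whisper-1", f)
        except openai.error.APIConnectionError:
            if attempt == attempts:
                raise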
Seeing the above issue intermittently in a local Jupyter notebook.
Error:
File /opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py:529, in APIRequestor.request_raw(self, method, url, params, supplied_headers, files, stream, request_id, request_timeout)
527 raise error.Timeout("Request timed out: {}".format(e)) from e
528 except requests.exceptions.RequestException as e:
--> 529 raise error.APIConnectionError(
530 "Error communicating with OpenAI: {}".format(e)
531 ) from e
532 util.log_debug(
533 "OpenAI API response",
534 path=abs_url,
(...)
537 request_id=result.headers.get("X-Request-Id"),
538 )
539 # Don't read the whole stream for debug logging unless necessary.
APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Main function:
import openai

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]
Versions:
$ pip list | grep openai
openai 0.27.5
$ python3 -V
Python 3.11.3
Any updates on this? We also see this fairly regularly.
I am having the same issue. Making an API call and then waiting for 30 minutes before calling again results in: openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
This is my workaround; I just wrap all of my OpenAI function calls inside this:

from aiohttp import ClientSession

async with ClientSession() as s:
    openai.aiosession.set(s)
    response = await openai.Completion.acreate(...)

This way a new client session is made after waiting, and it doesn't reuse the old one that would fail.
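A slightly more reusable sketch of the same idea (with_fresh_session is a hypothetical helper, not part of the openai module):

import openai
from aiohttp import ClientSession

async def with_fresh_session(coro_fn, *args, **kwargs):
    # Run any 0.x async call inside a brand-new aiohttp session so a
    # stale keep-alive connection is never reused, then reset the
    # context var so later calls don't pick up the closed session.
    async with ClientSession() as s:
        openai.aiosession.set(s)
        try:
            return await coro_fn(*args, **kwargs)
        finally:
            openai.aiosession.set(None)

# usage:
# response = await with_fresh_session(
#     openai.Completion.acreate, model="text-davinci-003", prompt="Say hi")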
Could you please help me understand why my attempt to apply that to an embedding function throws an error?

async def process_inputs(inputs, model_id="text-embedding-ada-002"):
    embeddings = []
    async with aiohttp.ClientSession() as s:
        openai.aiosession.set(s)
        for sentence in inputs:
            response = await openai.Embedding.create(
                engine=deployment_id,
                model=model_id,
                input=sentence,
                max_tokens=100,
                temperature=0
            )
            embeddings.append(response['data'][0]['embedding'])
    return embeddings

embeddings = await process_inputs(df['openai'].tolist())

Result:
TypeError: object OpenAIObject can't be used in 'await' expression
It needs to be wrapped in an async function first, and then you can call it using asyncio.
It is an async function, but GitHub is not formatting that first line right.
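For what it's worth, the TypeError happens because Embedding.create is the synchronous call; the 0.x client exposes an awaitable variant, acreate. A minimal sketch of the call inside that loop (deployment_id is the asker's own variable, assumed defined elsewhere; max_tokens and temperature are dropped here since embeddings don't sample tokens):

response = await openai.Embedding.acreate(
    engine=deployment_id,
    model=model_id,
    input=sentence,
)
embeddings.append(response['data'][0]['embedding'])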
If anyone is looking for a workaround that does not require changing to async, the following is working for us. It's the same idea as hc20k's workaround above: https://github.com/openai/openai-python/issues/371#issuecomment-1537622984
Using the support added in v0.27.6 to pass in a session, we do the following:
import logging

import openai
import requests

# Pass a new session to the openai module
openai.requestssession = requests.Session()

# Existing code calling openai
response = openai.Completion.create(...)

# Close and reset the session
try:
    openai.requestssession.close()
except Exception as e:
    logging.exception(e)
openai.requestssession = None
or using the 'with' syntax:

with requests.Session() as session:
    openai.requestssession = session
    response = openai.Completion.create(...)
openai.requestssession = None
We're not sure if setting openai.requestssession to None is required, but we weren't sure what else might be done with that attribute in the openai module. In our testing, we are no longer seeing the errors on long-running (web app) threads that make openai calls.
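The same pattern can be packaged up so the close-and-reset steps are harder to forget; a sketch (fresh_openai_session is a hypothetical helper, not something the openai module provides):

import contextlib

import openai
import requests

@contextlib.contextmanager
def fresh_openai_session():
    # Hand the openai module a brand-new session, then guarantee it is
    # reset and closed even if the API call raises.
    session = requests.Session()
    openai.requestssession = session
    try:
        yield session
    finally:
        openai.requestssession = None
        session.close()

# usage:
# with fresh_openai_session():
#     response = openai.Completion.create(...)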
@turnham It'll do the job but one thing you might miss out on is potential speed improvements by reusing persistent connections. The requests docs on Sessions explain this briefly and link to the basic idea. I still think that since OpenAI knows their own server configurations, if they modify the keep-alive settings, that'll have the most improvement on community use.
In my own case, I switched over to using tenacity, since the OpenAI docs recommend it.

from tenacity import retry, retry_if_exception_type, stop_after_attempt

@retry(
    stop=stop_after_attempt(2),
    retry=retry_if_exception_type(openai.error.APIConnectionError),
)
def call_openai():
    ...
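If the drops come in bursts, adding backoff between attempts may also help; a sketch using tenacity's wait_random_exponential (the values here are arbitrary):

from tenacity import (retry, retry_if_exception_type, stop_after_attempt,
                      wait_random_exponential)

@retry(
    wait=wait_random_exponential(min=1, max=20),
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type(openai.error.APIConnectionError),
)
def call_openai_with_backoff():
    ...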
@mathcass +1. Yes, it would be great if the openai module would take care of all of this.
I don't think your approach with tenacity retries would have helped in our situation, though. Once we had a long-running thread get into this state, all retries would fail. So to get that use case working consistently, we had to force the reset of openai's _thread_context.session, by making sure a cached session was never present to be re-used: https://github.com/openai/openai-python/blob/fe3abd16b582ae784d8a73fd249bcdfebd5752c9/openai/api_requestor.py#L79
But adding retries sounds like something we should be doing regardless of this issue, so thanks for the pointers to tenacity!
Just to add, for anybody who comes looking: I have this problem when I'm using a VPN, not sure why. If I shut off the VPN the problem goes away. I am outside the US, by the way.
EDIT: It now works with VPN lol
The same error was confirmed when using Azure OpenAI's gpt-3.5-turbo. Changing the model version from 0301 to 0613 resolved the issue. Anyone in the same situation may want to try this.
@turnham So can we say the openai client is not thread-safe, since openai.requestssession is global and can be changed by each thread? Is my understanding right?
@pamelafox Since you seem to be the most well-versed in this issue, I'd like to ask: should this now work with the openai-python client? I ran into this issue using long-running map-reduce calls in langchain. The error I would see is an aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed, and langchain doesn't retry on it.
It typically happens after my client has sent a request to 3.5-turbo-16k (openai) with a large number of input tokens (10-12k); after 2-3 minutes it gives me this error, bills me, and I don't end up with a generation.
body="<StreamReader e=ClientPayloadError('Response payload is not completed')>" message='Response payload is not completed'
body="<StreamReader e=ClientPayloadError('Response payload is not completed')>" message='Response payload is not completed'
Traceback (most recent call last):
File "/pkg/modal/_container_entrypoint.py", line 352, in handle_input_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 510, in run_input
value = await res
File "/root/reports.py", line 172, in analyze_document_summarize_llm_chain
res = await chain.acall(inputs={'input_documents': texts}, return_only_outputs=True)
File "/usr/local/lib/python3.9/site-packages/langchain/chains/base.py", line 361, in acall
raise e
File "/usr/local/lib/python3.9/site-packages/langchain/chains/base.py", line 355, in acall
await self._acall(inputs, run_manager=run_manager)
File "/usr/local/lib/python3.9/site-packages/langchain/chains/combine_documents/base.py", line 121, in _acall
output, extra_return_dict = await self.acombine_docs(
File "/usr/local/lib/python3.9/site-packages/langchain/chains/combine_documents/map_reduce.py", line 240, in acombine_docs
map_results = await self.llm_chain.aapply(
File "/usr/local/lib/python3.9/site-packages/langchain/chains/llm.py", line 209, in aapply
raise e
File "/usr/local/lib/python3.9/site-packages/langchain/chains/llm.py", line 206, in aapply
response = await self.agenerate(input_list, run_manager=run_manager)
File "/usr/local/lib/python3.9/site-packages/langchain/chains/llm.py", line 115, in agenerate
return await self.llm.agenerate_prompt(
File "/usr/local/lib/python3.9/site-packages/langchain/chat_models/base.py", line 424, in agenerate_prompt
return await self.agenerate(
File "/usr/local/lib/python3.9/site-packages/langchain/chat_models/base.py", line 384, in agenerate
raise exceptions[0]
File "/usr/local/lib/python3.9/site-packages/langchain/chat_models/base.py", line 485, in _agenerate_with_cache
return await self._agenerate(
File "/usr/local/lib/python3.9/site-packages/langchain/chat_models/openai.py", line 425, in _agenerate
response = await acompletion_with_retry(
File "/usr/local/lib/python3.9/site-packages/langchain/chat_models/openai.py", line 92, in acompletion_with_retry
return await _completion_with_retry(**kwargs)
File "/usr/local/lib/python3.9/site-packages/tenacity/_asyncio.py", line 88, in async_wrapped
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/tenacity/_asyncio.py", line 47, in __call__
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 314, in iter
return fut.result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/local/lib/python3.9/site-packages/tenacity/_asyncio.py", line 50, in __call__
result = await fn(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/langchain/chat_models/openai.py", line 90, in _completion_with_retry
return await llm.client.acreate(**kwargs)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/chat_completion.py", line 45, in acreate
return await super().acreate(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 217, in acreate
response, _, api_key = await requestor.arequest(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 382, in arequest
resp, got_stream = await self._interpret_async_response(result, stream)
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 729, in _interpret_async_response
(await result.read()).decode("utf-8"),
File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1037, in read
self._body = await self.content.read()
File "/usr/local/lib/python3.9/site-packages/aiohttp/streams.py", line 349, in read
raise self._exception
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 722, in _interpret_async_response
await result.read()
File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1037, in read
self._body = await self.content.read()
File "/usr/local/lib/python3.9/site-packages/aiohttp/streams.py", line 375, in read
block = await self.readany()
File "/usr/local/lib/python3.9/site-packages/aiohttp/streams.py", line 397, in readany
await self._wait("readany")
File "/usr/local/lib/python3.9/site-packages/aiohttp/streams.py", line 304, in _wait
await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
If this has indeed been fixed, then perhaps updating my openai lib version would solve the issue within my chains. I'm unsure though as this isn't easily reproducible so I cannot test it.
In the latest version, i.e. 0.28.0, you can also pass request_timeout=<timeout in sec> and have the session closed after the timeout.
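A sketch of what that looks like with the 0.x client (the 600 here is an arbitrary value):

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
    request_timeout=600,  # seconds; the session is closed once this elapses
)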
@microsoftbuild Looks like that could bring this issue up again, wrt the OP: https://github.com/openai/openai-python/pull/387
Also, we do want high timeouts, as our llm chains can potentially take many minutes to run, but not always.
@ShantanuNair In that case, you could use a request_timeout=<timeout in sec> param in the following workaround suggested by @turnham:

with requests.Session() as session:
    openai.requestssession = session
    # pass request_timeout=<timeout in sec> in the call below
    response = openai.Completion.create(...)
openai.requestssession = None
@microsoftbuild See this. I already have 600s set as the timeout, and this issue also impacts retries on 502s from Cloudflare.
I then tried specifying a request_timeout parameter for the OpenAI API request, but that caused every request to time out, due to this issue: https://github.com/openai/openai-python/pull/387
Appreciate your help!
This should be fixed in the beta of our upcoming v1.0.0; can you try it out and let us know whether or not it seems to be resolved?
Describe the bug
As we've used openai.ChatCompletion.create (with gpt-3.5-turbo), we've had intermittent APIConnectionError ('Connection reset by peer') failures
without a clear reproduction. At first I thought it was https://github.com/openai/openai-python/issues/91, due to too many open connections to the OpenAI servers. Now I think it looks more like https://github.com/openai/openai-python/issues/368 instead, but I have some hypotheses about it. I'm opening a new issue separate from https://github.com/openai/openai-python/issues/368 in case they're different. If this is a duplicate, feel free to tack my details on there.
My hypothesis is that if you have a long-running process (like a web server) that calls out to OpenAI, periods of inactivity cause the server side to terminate the connection, and it takes the client a long time to reestablish it. I dug into related issues on the requests side (like this one, https://github.com/psf/requests/issues/4937) that hinted at the root cause. Essentially, I believe the OpenAI servers are terminating the connection after a brief time (perhaps minutes), but the client still tries to keep it alive.
The reason I think this is a bug worth reporting is that you could modify the client code so it responds more gracefully to these server-side settings. Changing some of the keep-alive settings from the defaults would help out several folks using this.
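For anyone who wants to experiment client-side in the meantime, a sketch of that idea using the session support added in v0.27.6 (mentioned above): hand the module a session whose adapter retries once when a stale keep-alive connection is dropped. The Retry parameter is allowed_methods on recent urllib3, method_whitelist on older releases.

import openai
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry connection-level failures once, on all HTTP verbs (POST included).
retries = Retry(total=1, backoff_factor=0.5, allowed_methods=None)
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))
openai.requestssession = session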
To Reproduce
Code snippets
No response
OS
Linux
Python version
Python v3.8
Library version
openai-python 0.27.2