Environment details
OS type and version: docker container based on python:3.11-slim
Python version: 3.11
pip version: 24.1.1
google-cloud-aiplatform version: 1.67.1
Code example
Hi, we are using Gemini models (Pro, Flash) on the Vertex AI platform, and some of the async requests get stuck forever. The models are called through the library's async API.
There is no way to set a request timeout for this call, so we wrapped it in our own timeout.
Some of the long-running calls are caught by this timeout, and we are able to retry them, but others still hang forever for some reason.
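The timeout-and-retry approach described above could look roughly like the sketch below. This is not our exact code: `call_with_timeout`, `make_call`, `timeout_s`, and `max_attempts` are hypothetical names, and `make_call` stands in for whatever creates the model coroutine.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_timeout(
    make_call: Callable[[], Awaitable[T]],  # hypothetical: builds a fresh awaitable per attempt
    timeout_s: float,
    max_attempts: int = 3,
) -> T:
    """Await make_call() under a timeout and retry when the timeout fires."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # wait_for cancels the awaited call once timeout_s has elapsed
            return await asyncio.wait_for(make_call(), timeout=timeout_s)
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the timeout to the caller
    raise AssertionError("unreachable")
```

A wrapper like this can only help while the awaited call yields control to the event loop. If the call blocks the thread, the timeout callback never gets a chance to run during the call.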
It looks like something inside the library's async method blocks the thread, and with it the event loop. The effect is similar to the following code:
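import asyncio
import time

async def sleep_sync(timeout):
    # Blocking call: time.sleep() never yields to the event loop
    time.sleep(timeout)
    return timeout

async def sleep_async(timeout):
    # Cooperative call: asyncio.sleep() yields to the event loop
    await asyncio.sleep(timeout)
    return timeout

async def main():
    # No blocking: when the timeout is reached, we receive an exception
    try:
        await asyncio.wait_for(sleep_async(10), timeout=4)
    except asyncio.TimeoutError:
        print("sleep_async timed out")

    # Blocked by the synchronous time.sleep(): wait_for cannot interrupt
    # the call, so it runs for the full 10 seconds
    await asyncio.wait_for(sleep_sync(10), timeout=4)

asyncio.run(main())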
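For contrast, here is a minimal sketch of a case where `asyncio.wait_for` stays effective even though the underlying work blocks: the blocking function is moved to a worker thread with `asyncio.to_thread`, so the event loop stays free to fire the timeout. `blocking_call` and `run_with_timeout` are hypothetical names used for illustration only, and nothing here is a claim about what the library does internally.

```python
import asyncio
import time


def blocking_call(duration: float) -> float:
    # Hypothetical stand-in for any function that blocks its thread
    time.sleep(duration)
    return duration


async def run_with_timeout(duration: float, timeout: float) -> str:
    """Run blocking_call in a worker thread so the timeout can still fire."""
    try:
        await asyncio.wait_for(
            asyncio.to_thread(blocking_call, duration),
            timeout=timeout,
        )
        return "finished"
    except asyncio.TimeoutError:
        # The awaiting side is released, but the worker thread keeps
        # running until blocking_call returns; threads cannot be cancelled.
        return "timed out"
```

With this shape, `asyncio.run(run_with_timeout(10, 4))` reports a timeout instead of waiting for the call to finish. It only applies to plain blocking functions, though. A coroutine that blocks internally, like `sleep_sync` above, cannot be passed to `asyncio.to_thread` directly.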
Stack trace