Closed kmtechsupport closed 6 months ago
Thanks for reporting. Please share all the repro steps and your requirements.txt file.
@kmtechsupport Hi, please share your app name, invocation ID, SKU type, etc. This could be caused by an overloaded worker (memory pressure), network congestion, or something similar. With more details, we can take a closer look.
App name: https://getsentinelthreats.azurewebsites.net (this is the sandbox app being used for testing). Invocation ID of one of the failed instances: 4446776e217755dfcb12b5943c9bd841
Let me know if you need more info.
@bhagyshricompany Requirements:
azure-functions
azure-functions-durable
pandas
requests
numpy
aiohttp
asyncio
xmltodict
The steps to reproduce are something like the following:

    import asyncio
    import json
    import logging

    import azure.functions as func
    from aiohttp import ClientSession

    # ENDPOINT and items are assumed to be defined elsewhere in the app (not shown here).

    async def GetData(input, session):
        requestPayload = {
            'name': input
        }
        requestData = json.dumps(requestPayload)
        async with session.post(ENDPOINT, data=requestData) as resp:
            return await resp.json()

    async def main(mytimer: func.TimerRequest, starter: str) -> None:
        async with ClientSession() as session:
            try:
                backupData = await asyncio.gather(
                    *[
                        GetData(item, session)  # pass the loop variable, not the builtin input
                        for item in items
                    ],
                    return_exceptions=True
                )
            except Exception as e:
                logging.info(repr(e))
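A side note on the snippet above: because asyncio.gather is called with return_exceptions=True, a failing coroutine does not raise out of gather. Its exception comes back as an item in the results list, so the except branch is not reached for those failures. Below is a minimal sketch of how the results could be split into successes and failures. The names fetch, split_results and run are hypothetical, and fetch only simulates a request with asyncio.sleep instead of calling a real endpoint.

```python
import asyncio


async def fetch(name: str) -> dict:
    # Hypothetical stand-in for GetData: fails for one input to mimic a timeout.
    await asyncio.sleep(0.01)
    if name == "bad":
        raise asyncio.TimeoutError(f"timed out fetching {name}")
    return {"name": name}


def split_results(names, results):
    # With return_exceptions=True, failures come back as exception objects
    # in the same position as the input that produced them.
    ok, failed = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed[name] = result
        else:
            ok[name] = result
    return ok, failed


async def run(names):
    results = await asyncio.gather(
        *(fetch(n) for n in names), return_exceptions=True
    )
    return split_results(names, results)


if __name__ == "__main__":
    ok, failed = asyncio.run(run(["a", "bad", "c"]))
    print(sorted(ok), {k: repr(v) for k, v in failed.items()})
```

Logging the failed mapping per input would show which calls time out and whether it is always the same ones.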
@YunchuWang Not sure if it's worth mentioning, but the issue appears pretty much as soon as the number of coroutines exceeds 5. It is inconsistent, though: sometimes I get 50 successes back, sometimes 10.
Another thing that may be worth mentioning: I have also tried changing this to a durable function, with the orchestrator spawning these coroutines as activity functions, and it exhibits the exact same behavior. At a certain point the activity functions simply start timing out.
I have also tried chunking the input, so that instead of spawning all 100 coroutines at once the array is processed in chunks of 5. I have tried this in both the single-script and the durable function versions, and neither fixes the issue.
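For reference, the chunking approach described above can be sketched roughly as follows. This is an illustration, not the actual code from the app: call_api and process_in_chunks are hypothetical names, the request is simulated with asyncio.sleep, and the stats dictionary exists only to show that no more than chunk_size calls run at once.

```python
import asyncio

# Concurrency bookkeeping, used only to demonstrate the cap.
stats = {"in_flight": 0, "peak": 0}


async def call_api(item: int) -> int:
    # Hypothetical stand-in for the real request; it only tracks concurrency.
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    return item * 2


async def process_in_chunks(items, chunk_size=5):
    results = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        # Each chunk must finish completely before the next one starts.
        results.extend(
            await asyncio.gather(
                *(call_api(i) for i in chunk), return_exceptions=True
            )
        )
    return results


if __name__ == "__main__":
    out = asyncio.run(process_in_chunks(list(range(12)), chunk_size=5))
    print(out, stats["peak"])
```

One property of this design is that a single slow call holds up its whole chunk, so total run time depends on the slowest call in each chunk.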
@bhagyshricompany any luck with this one?
@gavin-aguiar pls comment
@bhagyshricompany will this be looked into or do I need to explore other solutions?
@vrdmr pls comment.
Sorry for the late reply. Taking a look now.
@kmtechsupport I am unable to reproduce it with 20 concurrent coroutines (each sleeping for 5 seconds) in a Linux Consumption app.
    import azure.functions as func
    import logging
    from aiohttp import ClientSession
    import asyncio

    app = func.FunctionApp(http_auth_level=func.AuthLevel.ANONYMOUS)

    async def sleep_and_print(session: ClientSession, i: int) -> None:
        await asyncio.sleep(5)
        logging.info(f"Hello from {i}")

    @app.route(route="http_trigger")
    async def http_trigger(req: func.HttpRequest) -> func.HttpResponse:
        async with ClientSession() as session:
            await asyncio.gather(
                *[
                    sleep_and_print(session, i)
                    for i in range(20)
                ],
                return_exceptions=True
            )
        return func.HttpResponse(f"Hello,This HTTP triggered function executed successfully.")
From the worker side, I don't observe any blocking issues or timeouts in the concurrent tasks. Could the API endpoints be suffering from availability issues? Can you try the code above? (Sorry, I can't find any internal logs for the app https://getsentinelthreats.azurewebsites.net/ anymore, as it has been too long.)
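One way to check the endpoint-availability theory might be to time each call and give it a much shorter per-call timeout than the 5-minute default, so slow calls show up quickly in the logs. The sketch below is only an illustration under assumptions: timed, fake_call and probe are hypothetical names, the request is simulated with asyncio.sleep, and the 0.1 second timeout is arbitrary. It uses asyncio.wait_for from the standard library.

```python
import asyncio
import time


async def timed(label, coro, timeout_s):
    # Returns (label, outcome, elapsed seconds); outcome is "ok" or "timeout".
    start = time.monotonic()
    try:
        await asyncio.wait_for(coro, timeout=timeout_s)
        outcome = "ok"
    except asyncio.TimeoutError:
        outcome = "timeout"
    return label, outcome, time.monotonic() - start


async def fake_call(delay_s):
    # Hypothetical stand-in for one API request.
    await asyncio.sleep(delay_s)


async def probe():
    # Simulated delays: one call well under the timeout, one well over it.
    delays = {"fast": 0.01, "slow": 0.5}
    return await asyncio.gather(
        *(timed(name, fake_call(d), timeout_s=0.1) for name, d in delays.items())
    )


if __name__ == "__main__":
    for label, outcome, elapsed in asyncio.run(probe()):
        print(f"{label}: {outcome} after {elapsed:.2f}s")
```

If the same inputs time out locally and in Azure, that points at the endpoint; if they only time out in the function app, it points at the hosting side.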
This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.
Hi team,
I am experiencing an issue where I use asyncio.gather to await roughly 100 coroutines. Each coroutine makes a simple API call and then returns a data frame. My code runs perfectly fine in my local environment, but in my Azure function about 10% of the coroutines return timeout errors (I am using the default ClientSession() timeout of 5 minutes). These coroutines are quick and only take about 10 seconds each.
The code for my function performing the actual HTTPS request is something like the following (this function is used in the coroutines)
Any advice or links to useful resources would be much appreciated.