Closed · ajstewart closed this 3 months ago
@ajstewart I see that you have enabled the PYTHON_ISOLATE_WORKER_DEPENDENCIES flag. Do you need this flag enabled? If not, could you disable it and try again?
Thanks for the reply. Ah yes, from my other issue (#1339) I know that flag can spell trouble.
I'll turn it off and get back to you; I'll need to build up some events to process first.
OK, so I repeated the test without PYTHON_ISOLATE_WORKER_DEPENDENCIES, and this has indeed stopped these particular errors. I'm now only seeing the function timeouts and 502s from my other service.
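For anyone else hitting this, the app setting can be disabled or removed with the Azure CLI; a minimal sketch, where `<app-name>` and `<resource-group>` are placeholders for your own values:

```shell
# Disable the flag explicitly:
az functionapp config appsettings set \
  --name <app-name> --resource-group <resource-group> \
  --settings PYTHON_ISOLATE_WORKER_DEPENDENCIES=0

# Or remove the setting entirely so the platform default applies:
az functionapp config appsettings delete \
  --name <app-name> --resource-group <resource-group> \
  --setting-names PYTHON_ISOLATE_WORKER_DEPENDENCIES
```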
Should this setting come with a bit more of a warning? As with the other issue I linked to, I seem to run into a lot of problems because of it, when the documentation originally made it sound like it should avoid such conflicts; instead it seems to create more.
I'll be careful to avoid this setting in the future.
I'm reporting this issue because something seems wrong with the actual starting of the function. If the error were true all the time, the function would never run, whereas it does complete successfully when the load is light.
I've also recently updated the function app to Python 3.11, so I'm not sure whether this occurs only on that version or on others as well.
Investigative information
Invocation ID: 1d160805-1108-481f-acb9-96e738070137
Repro steps
Unfortunately I can only reproduce this by putting my particular function under load.
The function (an experimental one) contains a call that can suffer from long response times (~3 minutes sometimes) and 502 gateway errors under load. I'm not sure whether this is masking or causing the error I'm reporting here, though I suspect they are linked given what I'm seeing.
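As a side note on those transient 502s: one pattern that can help distinguish genuine gateway flakiness from a startup problem is retrying the slow downstream call with exponential backoff instead of letting the whole invocation go to the poison queue. This is a minimal, hypothetical sketch (the `TransientHTTPError` type and `call_with_retries` helper are my own illustration, not part of Azure Functions):

```python
import time


class TransientHTTPError(Exception):
    """Hypothetical wrapper for a retryable HTTP failure (e.g. a 502)."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retries(fn, max_attempts=4, base_delay=1.0, retryable=(502,)):
    """Call fn(); on a retryable TransientHTTPError, back off exponentially
    (base_delay, 2*base_delay, 4*base_delay, ...) and try again, up to
    max_attempts total attempts. Non-retryable errors propagate immediately."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientHTTPError as exc:
            if exc.status not in retryable or attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```

With something like this wrapping the slow call, a message only lands in the poison queue after several genuine failures, which makes it easier to tell whether the worker-startup error is the real culprit.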
Expected behavior
I expect the function to load normally, as it does under light load, or to fail with a raised HTTP error or timeout.
Actual behavior
Under a load of about 1000 queue messages to churn through (a typical message should take around 20-30 seconds to process), I transiently get the error below. It seems pretty fundamental and perhaps points towards a new instance failing to start correctly. Most of the requests do go through OK, but I have to re-send some messages from the poison queue for processing to finish on all of them.
When the load is light I do not encounter any errors.
The function is not optimised or ideal, but I still expect the load to be processed, given that each task should only fail with bad gateways or perhaps timeouts at a stretch. Maybe that is masking things and this error is a red herring, but I thought I would report it in any case.
Known workarounds
Don't subject the function to high loads.
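A gentler workaround than avoiding load entirely may be to cap queue-trigger concurrency in host.json so each instance pulls fewer messages at once. A sketch (the values are illustrative, not recommendations):

```json
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 4,
      "newBatchThreshold": 0,
      "maxDequeueCount": 5
    }
  }
}
```

With `newBatchThreshold` set to 0, an instance only fetches a new batch once the current one finishes, which limits the pressure on the slow downstream call.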
Contents of the requirements.txt file:
I'd prefer not to share this, but the invocation ID is above.