Issue Background
We've recently identified a critical issue in the worker environment reload and Azure Functions initialization process. The issue has existed for some time, and we want to bring it to your attention so that both its cause and its resolution are clear.
During the specialization call in Consumption mode, the worker receives an environment reload request, which triggers setup steps for worker functionality and dependency isolation. However, if the host restarts the worker for any reason (for example, a timeout or a worker crash caused by running out of memory), the worker goes through the full language worker life cycle but does not receive an environment reload request. As a result, the application is not set up correctly: certain checks in the initialization request are skipped on the assumption that, in a Consumption scenario, the worker will later receive a reload request. This primarily affects apps that use dependency isolation (PYTHON_ISOLATE_WORKER_DEPENDENCIES set to 1).
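The sequence described here can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual worker code: `ToyWorker`, its methods, and the `/app/deps` and `/worker/deps` paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    """Hypothetical stand-in for a language worker; not the real worker code."""
    consumption: bool
    isolate_dependencies: bool
    search_path: list = field(default_factory=lambda: ["/worker/deps"])

    def _setup_app_dependencies(self):
        # With isolation on, put the app's packages ahead of the worker's own.
        if self.isolate_dependencies and "/app/deps" not in self.search_path:
            self.search_path.insert(0, "/app/deps")

    def handle_init_request(self):
        # Assumed shortcut: on Consumption, setup is deferred to the reload request.
        if not self.consumption:
            self._setup_app_dependencies()

    def handle_environment_reload(self):
        self._setup_app_dependencies()

    def can_import_app_module(self):
        return "/app/deps" in self.search_path


def specialize_on_consumption():
    # Normal path: init request followed by an environment reload request.
    worker = ToyWorker(consumption=True, isolate_dependencies=True)
    worker.handle_init_request()
    worker.handle_environment_reload()
    return worker


def restart_on_consumption():
    # Host-initiated restart: full life cycle, but no reload request arrives.
    worker = ToyWorker(consumption=True, isolate_dependencies=True)
    worker.handle_init_request()
    return worker
```

In this model, `specialize_on_consumption()` yields a worker that can see the app's packages, while `restart_on_consumption()` does not, because the only code path that would have configured them never runs.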
The result of this behavior is a ModuleNotFoundError in the function app, which can affect the app's availability. This issue does not affect Elastic Premium or Dedicated function apps.
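To help tell whether a failure matches this issue, one option is to log some context when an import fails. The following is a minimal sketch, assuming the app setting is visible to the function as an environment variable; `import_or_report` is a hypothetical helper, not part of the worker or the Functions SDK.

```python
import importlib
import logging
import os
import sys

ISOLATION_SETTING = "PYTHON_ISOLATE_WORKER_DEPENDENCIES"


def import_or_report(module_name):
    """Import a module; on failure, log the isolation setting and sys.path."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        # Record what the worker process can see, then re-raise unchanged.
        logging.error(
            "Could not import %r; %s=%r; sys.path=%r",
            module_name,
            ISOLATION_SETTING,
            os.environ.get(ISOLATION_SETTING),
            list(sys.path),
        )
        raise
```

If the logged `sys.path` is missing the app's package directory after a worker restart, while the same import works after a fresh app start, the failure is consistent with the behavior described above.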
Expected behavior
The Python worker should correctly handle the restart case irrespective of the SKU being used.
Known workarounds
Restart the function app. This should reload all the modules needed.
Reduce the conditions that cause the host to restart workers, such as timeouts and out-of-memory crashes.
Move the app to a Dedicated or Elastic Premium (EP) plan for the time being.
Note that these are only temporary workarounds. The issue has been fixed, and the fix is being deployed. It should be available in runtime version 4.28.
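Once you know your app's runtime version, a comparison against 4.28 can be sketched as follows. `runtime_has_fix` is a hypothetical helper, and the version string is assumed to be dot-separated with numeric major and minor parts. How you obtain the version is left open here.

```python
def runtime_has_fix(version, fixed_in=(4, 28)):
    """Return True if a dotted version string is at or above the fixed release."""
    # Compare numerically: a plain string comparison would rank "4.100" below "4.28".
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor >= fixed_in
```

For example, `runtime_has_fix("4.27.5")` is false and `runtime_has_fix("4.28.0")` is true.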