Open AartBluestoke opened 4 years ago
cc @mathewc as FYI
Moving issue to Triaged milestone.
Hi, I have just seen this again. I have a function exploding with 2 GB of RAM usage (then a crash) every 5 minutes at the moment. This was triaged, above. Is there a timeline for any next steps? Thanks, Andrew.
When a storage account receives many blob writes (more than 1 million per hour) and there is a blob trigger anywhere on that same storage account (even on a different container), job hosts with limited memory can crash, due to the log watcher materializing large arrays.
Repro steps
Have some code that writes many blobs. The following code is hacked together to simulate failure conditions similar to what was observed in production. Warning: it could run up a large bill by doing all the blob writes. https://gist.github.com/AartBluestoke/48115e7a80ac1df2b8360af0d58948b9
In a different Azure Function, on a different function host, have a blob trigger somewhere in the same storage account.
Expected behavior
Code runs as normal: one function writing lots of blobs doesn't negatively impact other functions (other than through directly requested work).
Actual behavior
The function crashes with an out-of-memory error. Analysis of a crash dump memory snapshot shows almost all memory used by 800,000 blobs held within 2 arrays within "BlobLogListener.GetRecentWritesAsync".
https://github.com/Azure/azure-webjobs-sdk/blob/85d463faa28790d72f0cda8f00b95db1030ba7b0/src/Microsoft.Azure.WebJobs.Extensions.Storage/Blobs/Listeners/PollLogsStrategy.cs#L124 materializes the enumerable of all recent changes for that container into a single array, even if there is no BlobTrigger attached to that container
https://github.com/Azure/azure-webjobs-sdk/blob/c9d92b2c271e1f4bd8120fc2f7b6cea5a50289a7/src/Microsoft.Azure.WebJobs.Extensions.Storage/Blobs/Listeners/BlobLogListener.cs#L55 will also materialize a blob list of all blobs modified within the threshold (the comment indicates 2 hours).
A) Materializing all responses from a batch interface into a single list is not good practice. B) Combining the two behaviors above means that a (large) list is read into one array, then regrouped and re-materialized into a second array.
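To illustrate the alternative being suggested, here is a minimal sketch of streaming log entries in bounded batches instead of materializing them. It is written in Python purely for illustration and is not the SDK's code: `BlobWrite`, `recent_writes`, `batched`, and `notify_in_batches` are hypothetical names, and the real listener is C#. The sketch assumes two things: writes to containers with no trigger can be dropped as they are read, and grouping per container only needs to happen within one small batch at a time.

```python
from collections import defaultdict
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, NamedTuple, Set


class BlobWrite(NamedTuple):
    """Hypothetical stand-in for one parsed storage-log entry."""
    container: str
    blob_name: str


def recent_writes(entries: Iterable[BlobWrite], watched: Set[str]) -> Iterator[BlobWrite]:
    """Lazily yield only the writes to containers that have a trigger registered."""
    for entry in entries:
        if entry.container in watched:
            yield entry


def batched(items: Iterable[BlobWrite], size: int) -> Iterator[List[BlobWrite]]:
    """Yield lists of at most `size` items, so only one batch is held at a time."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def notify_in_batches(
    entries: Iterable[BlobWrite],
    watched: Set[str],
    notify: Callable[[str, List[str]], None],
    batch_size: int = 1000,
) -> int:
    """Group writes by container one batch at a time and return how many were seen."""
    total = 0
    for batch in batched(recent_writes(entries, watched), batch_size):
        by_container: Dict[str, List[str]] = defaultdict(list)
        for write in batch:
            by_container[write.container].append(write.blob_name)
        for container, names in by_container.items():
            notify(container, names)
        total += len(batch)
    return total
```

With this shape, peak memory is bounded by `batch_size` rather than by the number of blobs written in the log window, and a container that has no trigger contributes nothing beyond the cost of reading its entries.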
Known workarounds
None, other than avoiding blob triggers on any container in a storage account that has a high blob write volume.
Related information
Further discussion with Azure support staff on ticket 120081723001883. Full memory dump available on request.