Azure / azure-functions-durable-extension

Durable Task Framework extension for Azure Functions
MIT License

RpcException "ResourceExhausted" while attempting to read orchestrator output if the input is greater than 4MB #2831

Open DCorrBECare opened 1 month ago

DCorrBECare commented 1 month ago

Description

I'm scheduling an orchestrator with an input bigger than 4MB (which is now possible thanks to #2778). Everything works as expected except when I try to read the orchestrator's output.

Expected behavior

I can read the orchestrator's output without an exception being thrown.

Actual behavior

Exception: Grpc.Core.RpcException: Status(StatusCode="ResourceExhausted", Detail="Received message exceeds the maximum configured message size.")

Relevant source code snippets

FunctionApp1.zip

Known workarounds

No known workarounds

App Details

Screenshots

https://github.com/Azure/azure-functions-durable-extension/assets/97175816/8555dc6f-1c03-42e7-9c6f-fbc611a9e208

DCorrBECare commented 2 weeks ago

Hi team, any update on this?

davidmrdavid commented 2 weeks ago

Sorry for the delay here. This seems to be an oversight in the GetInstances API, which is probably making a gRPC call with a default message size limit somewhere.

cc/ @jviau in case he's able to easily find the source code where this config can be specified. If this is an easy fix, there's no reason why we couldn't sneak in that fix in the coming weeks.

@DCorrBECare: are you able to work around this issue by not passing this "large" data directly, and instead using a claim-check pattern (i.e., passing a lightweight identifier that you can use to download the data when needed)? Just thinking of workarounds, as it may take us a bit to prioritize this, especially given that it is a DF anti-pattern to directly deal with inputs that are several MB in size.
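For readers unfamiliar with the claim-check pattern mentioned above, here is a minimal sketch of the idea. It is not code from this repository: `IPayloadStore` and `InMemoryPayloadStore` are hypothetical names, and the in-memory dictionary only stands in for external storage (such as a blob container) so that the example runs without any Azure dependency.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

// Hypothetical abstraction over external storage (e.g. a blob container).
public interface IPayloadStore
{
    string Save(byte[] payload);   // stores the payload and returns a claim ID
    byte[] Load(string claimId);   // redeems a claim ID for the payload
}

// In-memory stand-in (an assumption for illustration, not a real storage client).
public sealed class InMemoryPayloadStore : IPayloadStore
{
    private readonly ConcurrentDictionary<string, byte[]> _items = new();

    public string Save(byte[] payload)
    {
        string claimId = Guid.NewGuid().ToString("N");
        _items[claimId] = payload;
        return claimId;
    }

    public byte[] Load(string claimId) =>
        _items.TryGetValue(claimId, out var payload)
            ? payload
            : throw new InvalidOperationException($"No payload stored for claim '{claimId}'.");
}

public static class Program
{
    public static void Main()
    {
        IPayloadStore store = new InMemoryPayloadStore();

        // Caller side: park the large payload and keep only its ID.
        byte[] largeInput = new byte[5 * 1024 * 1024];
        string claimId = store.Save(largeInput);

        // The short claim ID is what would be passed as the orchestrator input.
        Console.WriteLine($"Orchestrator input size: {Encoding.UTF8.GetByteCount(claimId)} bytes");

        // Consumer side: redeem the claim where the data is actually needed.
        byte[] restored = store.Load(claimId);
        Console.WriteLine($"Restored payload size: {restored.Length} bytes");
    }
}
```

In a real Durable Functions app the `Load` call would typically live in an activity function rather than in the orchestrator itself, since orchestrator code is expected to stay deterministic and free of direct I/O.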

jviau commented 2 weeks ago

@DCorrBECare can you provide a minimal repro?

Even though we set the max message length to int.MaxValue, this doesn't mean you can use inputs/outputs of arbitrary size. These are still network calls that have constraints out of our control applied to them.

jviau commented 2 weeks ago

Ah okay - we set the increased limit on the WebJobs side, but we have not increased the channel limit on the worker side. The work is to configure channel options in these two files:

https://github.com/Azure/azure-functions-durable-extension/blob/dev/src/Worker.Extensions.DurableTask/FunctionsDurableClientProvider.net6.cs
https://github.com/Azure/azure-functions-durable-extension/blob/dev/src/Worker.Extensions.DurableTask/FunctionsDurableClientProvider.netstandard.cs
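As a rough illustration of what "configuring channel options" can look like, the sketch below raises the gRPC message size limits on a client channel. It is not the actual fix and does not reproduce the contents of the two files above. It assumes the .NET 6 target builds its channel with Grpc.Net.Client and the netstandard target with Grpc.Core; `LargeMessageChannelFactory` and its `Create` methods are hypothetical names, and `int.MaxValue` simply mirrors the limit described earlier in this thread.

```csharp
using System;
#if NET6_0_OR_GREATER
using Grpc.Net.Client;
#else
using Grpc.Core;
#endif

// Hypothetical helper, for illustration only.
public static class LargeMessageChannelFactory
{
#if NET6_0_OR_GREATER
    // Grpc.Net.Client: MaxReceiveMessageSize defaults to 4 MB, which matches
    // the threshold at which the ResourceExhausted error appears.
    public static GrpcChannel Create(Uri address)
    {
        var options = new GrpcChannelOptions
        {
            MaxReceiveMessageSize = int.MaxValue,
            MaxSendMessageSize = int.MaxValue,
        };
        return GrpcChannel.ForAddress(address, options);
    }
#else
    // Grpc.Core: the same limits are passed as named channel options.
    // Insecure credentials are an assumption made to keep the sketch short.
    public static Channel Create(string target)
    {
        var options = new[]
        {
            new ChannelOption(ChannelOptions.MaxReceiveMessageLength, int.MaxValue),
            new ChannelOption(ChannelOptions.MaxSendMessageLength, int.MaxValue),
        };
        return new Channel(target, ChannelCredentials.Insecure, options);
    }
#endif
}
```

Raising the channel limit only removes the client-side cap. As noted earlier in the thread, these are still network calls, so very large payloads remain subject to constraints outside the extension's control.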

DCorrBECare commented 2 weeks ago

Hi @jviau @davidmrdavid, thank you for your feedback.

Yes, I am able to work around this issue with a claim-check pattern; that's what I have been doing so far.

I am reading the orchestrator's output inside an HTTP function. The output size doesn't really matter: the exception occurs whenever the input is greater than 4MB. In the repro I attached to the issue, the output is only one character.