Azure / azure-functions-durable-extension

Durable Task Framework extension for Azure Functions
MIT License

Correctly propagate orchestration results of a failed orchestration #2748

Status: Closed (sebastianburckhardt closed 3 months ago)

sebastianburckhardt commented 4 months ago

As diagnosed in #2743, the current code in OutOfProcMiddleware.CallOrchestratorAsync does not correctly propagate the results of a failed orchestration to the backend.

The problem is that when an orchestrator function fails, the OrchestratorExecutionResult that was constructed by the executor is ignored, and instead an alternate result is constructed via OrchestratorExecutionResult.ForFailure from the exception returned by the function result. This is a problem because:

  1. the original failure details are lost instead of being persisted in the history.
  2. any orchestrator actions that took place prior to the failure (e.g. sending lock release messages) are ignored.

This PR fixes the problem by propagating the original OrchestratorExecutionResult that describes the failure (if there is one).
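
The difference between the old and the new behavior can be modeled with a small sketch. This is not the extension's C# code: `ExecutionResult`, `FailureDetails`, `result_before_fix`, and `result_after_fix` are hypothetical stand-ins, and the action string is made up for illustration. It only mirrors the two effects listed above, assuming the executor's result carries both the failure details and the actions recorded before the failure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FailureDetails:
    error_type: str
    message: str


@dataclass
class ExecutionResult:
    """Hypothetical stand-in for OrchestratorExecutionResult."""
    actions: List[str] = field(default_factory=list)
    failure: Optional[FailureDetails] = None


def failure_from_exception(exc: Exception) -> ExecutionResult:
    """Models a ForFailure-style result: failure details only, no actions."""
    return ExecutionResult(failure=FailureDetails(type(exc).__name__, str(exc)))


def result_before_fix(executor_result: Optional[ExecutionResult],
                      function_exception: Optional[Exception]) -> Optional[ExecutionResult]:
    """Old behavior as described: on failure, ignore the executor's result."""
    if function_exception is not None:
        return failure_from_exception(function_exception)
    return executor_result


def result_after_fix(executor_result: Optional[ExecutionResult],
                     function_exception: Optional[Exception]) -> Optional[ExecutionResult]:
    """Fixed behavior as described: propagate the original failure result if there is one."""
    if function_exception is not None:
        if executor_result is not None and executor_result.failure is not None:
            return executor_result
        return failure_from_exception(function_exception)
    return executor_result


# An orchestration that released a lock and then failed (illustrative values).
original = ExecutionResult(
    actions=["release-lock:entityA"],
    failure=FailureDetails("InvalidOperationException", "failed inside critical section"),
)
wrapper = RuntimeError("function invocation failed")

old = result_before_fix(original, wrapper)
new = result_after_fix(original, wrapper)

print(old.actions)             # [] : the lock release is lost
print(old.failure.error_type)  # RuntimeError : the original details are lost
print(new.actions)             # ['release-lock:entityA']
print(new.failure.error_type)  # InvalidOperationException
```

In this model, the fallback to an exception-based result remains only for the case where the executor produced no failure result at all, which matches the "(if there is one)" qualifier above.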

lilyjma commented 3 months ago

@cgillum / @jviau - requesting your review because some customers are asking about this. There's a DF release starting next week, so it'd be great if this fix could go out with that.

sebastianburckhardt commented 3 months ago

> Is there an orchestration that you used for testing this fix which we can borrow and include as part of an end-to-end test?

Yes. All the tests are in this PR: https://github.com/Azure/azure-functions-durable-extension/pull/2612

The specific test that would have detected this problem is called FaultyCriticalSection.