The current implementation of the flow service calls the Functions platform synchronously to invoke a function then waits for the response. This issue discusses whether that is the best strategy or if an alternative would be better.
The client in flow-service holds open a connection to the Functions platform for the duration of the invocation.
Mechanisms of woe
1. An intermediate network interruption terminates the connection before the response is completely received
2. Timeout of the TCP connection
3. Flow service dies/is re-scheduled...
Woe is us
In all cases, the Completion Stage has to be considered failed and not retryable.
Users can recover from this to a certain extent with error handling in flow, but they will not know whether the original task has finished or not. From flow's point of view it is essentially lost.
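To make the failure mode concrete, here is a minimal simulation of it. Everything in it is hypothetical: `FakeFunctionsPlatform`, `invoke_sync`, `ConnectionLost` and `run_stage` are illustrative stand-ins, not the real flow-service or Functions APIs. It only shows that the function can run to completion while the caller still has to record the stage as failed.

```python
class ConnectionLost(Exception):
    """Hypothetical error: the connection dropped before the response arrived."""


class FakeFunctionsPlatform:
    """Stand-in for the Functions platform (not the real API)."""

    def __init__(self, drop_connection):
        self.drop_connection = drop_connection
        self.completed_calls = []  # calls whose function body actually ran

    def invoke_sync(self, call_id, payload):
        result = payload.upper()              # the function does its work
        self.completed_calls.append(call_id)  # ...and finishes on the platform
        if self.drop_connection:
            # The response is lost in transit (mechanisms 1 and 2).
            raise ConnectionLost(call_id)
        return result


def run_stage(platform, call_id, payload):
    """Synchronous invocation as seen from the flow service."""
    try:
        return ("succeeded", platform.invoke_sync(call_id, payload))
    except ConnectionLost:
        # The caller cannot tell whether the function ran, so the stage
        # is recorded as failed and is not retried.
        return ("failed", None)
```

With `drop_connection=True`, `run_stage` reports the stage as failed even though `completed_calls` shows that the function finished, which is the "lost" state described above.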
Alternative proposal
The call to the Functions platform can be treated as an asynchronous invocation. This means (AIUI) that the call returns immediately with an ID, which can then be queried for completion status and result. Flow would then have to persist the fact that the async invocation has been started, along with the ID, and the FDK would need to push the result back to the flow service on completion.
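A rough sketch of the bookkeeping this would imply on the flow side is shown below. All names are assumptions made for illustration (`FakeAsyncPlatform`, `invoke_async`, `FlowService`, `start_stage`, `on_result`), and the in-memory dicts stand in for whatever persistent storage flow would really use. It is not the actual Functions or FDK API.

```python
import uuid


class FakeAsyncPlatform:
    """Stand-in for the Functions platform's async invoke (hypothetical)."""

    def __init__(self):
        self.queued = {}  # call_id -> payload waiting to be executed

    def invoke_async(self, payload):
        call_id = str(uuid.uuid4())
        self.queued[call_id] = payload
        return call_id  # returns immediately, before the function runs


class FlowService:
    """Tracks in-flight async invocations (dicts stand in for a real store)."""

    def __init__(self, platform):
        self.platform = platform
        self.pending = {}  # call_id -> stage_id, recorded when the call starts
        self.results = {}  # stage_id -> result pushed back on completion

    def start_stage(self, stage_id, payload):
        call_id = self.platform.invoke_async(payload)
        self.pending[call_id] = stage_id  # persist "started" along with the ID
        return call_id

    def on_result(self, call_id, result):
        """Callback the FDK would hit to push the result back."""
        stage_id = self.pending.pop(call_id, None)
        if stage_id is None:
            return False  # unknown or already-completed call; ignore it
        self.results[stage_id] = result
        return True
```

Because the pending record is keyed by the returned ID, a dropped connection after `invoke_async` returns no longer loses the stage, and a repeated push for the same ID is ignored rather than completing the stage twice.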
Pros:
More resilient in case of network interruption, i.e. mechanisms 1 and 2
Fewer timeout issues as async calls can queue
Cons:
New event path for async results
The Functions platform may take an unbounded amount of time before starting the execution
Not clear what to do if the invoke-async call to Functions errors