boomerang-io / community

The Boomerang community, roadmap, planning, and architecture repository. The central place for information on joining, contributing, and governance.
https://useboomerang.io
Apache License 2.0

Engine - Status Update Collision in Asynchronous Task updates #376

Closed tlawrie closed 4 months ago

tlawrie commented 1 year ago

Ran into some performance issues in the engine. Because task execution is now distributed, there was the possibility that a Handler could report back and ask for End task before our Start task had finished processing the start. :disappointed:

Example: in about 46 milliseconds, the Handler would complete their task, which was faster than our system processed the task.

Commence Start -> [2023-02-14 04:55:12.973]
Commence End -> [2023-02-14 04:55:13.013]
Finish End -> [2023-02-14 04:55:13.024]
Finish Start -> [2023-02-14 04:55:13.094]
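To make the ordering easier to see, here is a small sketch that turns the four log timestamps above into offsets from Commence Start. Only the timestamps come from the log; the variable and function names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log lines above.
events = {
    "Commence Start": "2023-02-14 04:55:12.973",
    "Commence End": "2023-02-14 04:55:13.013",
    "Finish End": "2023-02-14 04:55:13.024",
    "Finish Start": "2023-02-14 04:55:13.094",
}

FORMAT = "%Y-%m-%d %H:%M:%S.%f"
parsed = {name: datetime.strptime(ts, FORMAT) for name, ts in events.items()}
origin = parsed["Commence Start"]


def offset_ms(name):
    """Milliseconds elapsed between Commence Start and the named event."""
    delta = parsed[name] - origin
    return round(delta.total_seconds() * 1000)


offsets = {name: offset_ms(name) for name in events}
for name, ms in offsets.items():
    print(f"{name:<15} +{ms} ms")

# How long each handler took from commence to finish.
start_duration = offsets["Finish Start"] - offsets["Commence Start"]
end_duration = offsets["Finish End"] - offsets["Commence End"]
print(f"Start took {start_duration} ms, End took {end_duration} ms")
```

End begins and finishes entirely inside the window in which Start is still processing, so Start's save is the last write.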

tlawrie commented 1 year ago

I've implemented a few TaskRun locks, the same as the WorkflowRun locks, and am waiting to see if that fixes the problem. Alternatively, I can remove the canCompleteTask check from Start and move it to the Queue instead (looking into that too).

tlawrie commented 1 year ago

Essentially, if an external handler takes a queued task, sends back the start task (API), and then 120 milliseconds later ends the task (API), somehow the end task finishes before the start task saves. This then causes the start to save over the top of the end. I tried task-level locks but that still didn't work; end was still faster than start. Any thoughts on how to fix this?
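The overwrite described above can be reproduced deterministically without any threads. The following is only a sketch: the in-memory `store`, the `load`/`save` helpers, and the status strings are hypothetical stand-ins, not the engine's actual repository or entity code.

```python
# Hypothetical in-memory stand-in for the TaskRun store.
store = {"task-1": {"status": "ready"}}


def load(task_id):
    # Return a copy, the way a repository read hands back a detached entity.
    return dict(store[task_id])


def save(task_id, entity):
    store[task_id] = dict(entity)


history = []

# The Start handler reads the task, then is slow to finish processing.
start_copy = load("task-1")
start_copy["status"] = "running"

# The End handler runs to completion while Start is still processing.
end_copy = load("task-1")
end_copy["status"] = "completed"
save("task-1", end_copy)
history.append(store["task-1"]["status"])

# Start finally saves its stale copy over the top of End's result.
save("task-1", start_copy)
history.append(store["task-1"]["status"])

print(history)
```

The task ends up back in the running state even though the handler has already reported the end.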

tlawrie commented 1 year ago

I think the problem with (1), and probably all the options, is that I have no way of knowing which order the requests came in, because going from Ready to Running and from Ready to Completed are both valid, i.e. you can skip Running and fail the task.
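A rough illustration of why a transition check alone cannot order the two requests. The transition table and status names here are assumptions made for the example, not the engine's real status model.

```python
# Assumed transition table; the status names are illustrative only.
VALID_TRANSITIONS = {
    "ready": {"running", "completed", "failed"},
    "running": {"completed", "failed"},
}


def is_valid(current, requested):
    """True if moving from `current` to `requested` is an allowed transition."""
    return requested in VALID_TRANSITIONS.get(current, set())


# Both the Start request and the End request read the task while it is Ready.
requests = ["running", "completed"]
accepted = [status for status in requests if is_valid("ready", status)]
print(accepted)
```

Both requests pass the check against a task that is still Ready, so the check gives no hint about which one arrived first.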

tlawrie commented 1 year ago

Decision: move the start and end into non-async methods, as they are externally driven (i.e. API calls) and need to update the status and phase synchronously. The System Tasks will get a new async executeSystemTask method, and the finish / executeNextSteps methods will move to async for the end. This is similar to how queue workflow is synchronous while start workflow is async.
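A minimal sketch of the shape of that decision, assuming the handler waits for the start response before it sends the end request. The `TaskRunService` class, its fields, and the status strings are hypothetical; only the idea (synchronous status update, asynchronous follow-on work) comes from the decision above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TaskRunService:
    """Hypothetical service: status writes are synchronous, follow-ups are async."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}
        self._executor = ThreadPoolExecutor(max_workers=2)
        self.follow_ups = []

    def queue(self, task_id):
        with self._lock:
            self._status[task_id] = "ready"

    def start(self, task_id):
        # Synchronous: the status is saved before the API call returns.
        with self._lock:
            self._status[task_id] = "running"

    def end(self, task_id, status="completed"):
        # Synchronous: the final status is saved before the API call returns.
        with self._lock:
            self._status[task_id] = status
        # Only the follow-on work (the executeNextSteps analogue) runs async.
        return self._executor.submit(self._execute_next_steps, task_id)

    def _execute_next_steps(self, task_id):
        self.follow_ups.append(task_id)

    def status(self, task_id):
        with self._lock:
            return self._status[task_id]

    def close(self):
        self._executor.shutdown(wait=True)


service = TaskRunService()
service.queue("task-1")
service.start("task-1")          # returns only after "running" is saved
future = service.end("task-1")   # returns only after "completed" is saved
future.result()                  # wait for the async follow-up to finish
final_status = service.status("task-1")
service.close()
print(final_status)
```

Because each call returns only after its own write, a handler that sends start and then end can no longer have the end saved underneath a start that is still in flight.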