Open hamersaw opened 2 years ago
Motivation: Why do you think this is important?
Currently, instances of the web api sync plugin block FlytePropeller workers during execution. Under heavy load, whether from expensive calls or a large number of calls, we effectively starve FlytePropeller workers. As a result, other workflows are unable to progress.
Goal: What should the final outcome look like, ideally?
We propose to execute instances of the web api sync plugin asynchronously. This paradigm can emulate the current webapi async plugin scheme, where plugins have a `Get` function that retrieves status and is called in the background to populate a global cache, and a `Status` call which is called by FlytePropeller to check the status of that cache. We can use goroutines to execute synchronous web api calls in the background and check the status of those calls with a lightweight lookup into a cache. This will ensure FlytePropeller workers maintain their lightweight design and reduce starvation under heavy load.
Describe alternatives you've considered
Restricting the maximum number of nodes processed in a FlytePropeller round has been proposed. However, this still means that execution of the web api sync plugin is in the critical path. Additionally, a worker is unable to parallelize these operations, meaning that workflow evaluation is essentially time-division multiplexing.
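For contrast with the alternative above, here is a minimal sketch of the proposed goroutine-plus-cache scheme. It is not Flyte's plugin machinery: `CallCache`, `Submit`, `Status`, `Wait`, and the `Phase` values are hypothetical names chosen for illustration, and the sketch omits cache eviction, retries, and any limit on concurrent goroutines.

```go
package main

import "sync"

// Phase is the coarse state of a background call (hypothetical type).
type Phase int

const (
	PhaseNotFound Phase = iota // zero value: the key was never submitted
	PhaseRunning
	PhaseSucceeded
	PhaseFailed
)

// Result is what a worker sees when it checks on a call.
type Result struct {
	Phase  Phase
	Output string
	Err    error
}

// CallCache runs blocking calls in goroutines and records their latest state.
type CallCache struct {
	mu      sync.Mutex
	entries map[string]Result
	wg      sync.WaitGroup
}

func NewCallCache() *CallCache {
	return &CallCache{entries: make(map[string]Result)}
}

// Submit starts call in the background unless key is already tracked.
// It returns immediately, so the calling worker is never blocked on the call.
func (c *CallCache) Submit(key string, call func() (string, error)) {
	c.mu.Lock()
	if _, tracked := c.entries[key]; tracked {
		c.mu.Unlock()
		return
	}
	c.entries[key] = Result{Phase: PhaseRunning}
	c.wg.Add(1)
	c.mu.Unlock()

	go func() {
		defer c.wg.Done()
		out, err := call() // the synchronous web api call happens here
		res := Result{Phase: PhaseSucceeded, Output: out}
		if err != nil {
			res = Result{Phase: PhaseFailed, Err: err}
		}
		c.mu.Lock()
		c.entries[key] = res
		c.mu.Unlock()
	}()
}

// Status is the lightweight lookup a worker performs on each evaluation round.
func (c *CallCache) Status(key string) Result {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.entries[key] // a missing key yields PhaseNotFound
}

// Wait blocks until all submitted calls have finished (e.g. on shutdown).
func (c *CallCache) Wait() {
	c.wg.Wait()
}
```

In this sketch a worker would call `Submit` the first time it visits a node and `Status` on every later round, so the time spent per round stays bounded by a map lookup rather than by the remote call's latency.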
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?