sematic-ai / sematic

An open-source ML pipeline development platform

Improve parallelism for deeply nested parallel branches #1121

Closed augray closed 5 months ago

augray commented 5 months ago

Closes #1120

In certain cases, Sematic could get "stuck" waiting on one branch of a pipeline even though more work could be done in other branches. The bug was that it can take a few iterations through the state-evolution loop before all the changes that make it OK to schedule a particular run have taken effect, yet Sematic could enter a wait after an iteration that had still changed something. The fix is to check whether an iteration of the state-evolution loop caused anything in the pipeline to change state and, if it did, to skip the wait and iterate again. Sematic now enters a wait only once changes are no longer being made to the pipeline state.
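The idea can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not Sematic's actual resolver code: the `Run` class, the state names, and the functions `evolve_once` and `advance_before_wait` are all hypothetical names invented for this illustration. It shows a run in a fast branch needing more than one pass through the loop before it is scheduled, which is why waiting after a single pass can leave work undone.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    # Hypothetical toy model of a pipeline run; not a Sematic class.
    name: str
    upstream: list = field(default_factory=list)
    state: str = "created"  # created -> ready -> scheduled -> done


def evolve_once(runs):
    """One pass of the toy state-evolution loop.

    Each run moves at most one step per pass, based on a snapshot taken
    at the start of the pass. Returns True if any run changed state.
    """
    changed = False
    snapshot = {r.name: r.state for r in runs}
    for run in runs:
        if run.state == "created" and all(
            snapshot[u] == "done" for u in run.upstream
        ):
            run.state = "ready"
            changed = True
        elif run.state == "ready":
            run.state = "scheduled"
            changed = True
    return changed


def advance_before_wait(runs, until_stable):
    """Run loop passes until the point where the resolver would wait.

    until_stable=False models the old behavior (wait after one pass).
    until_stable=True models the fix (wait only after a pass that
    changed nothing). Returns the number of passes performed.
    """
    passes = 0
    while True:
        passes += 1
        changed = evolve_once(runs)
        if not (until_stable and changed):
            return passes


def make_runs():
    # Left branch already finished its fast step; right branch is still
    # running its slow step.
    return [
        Run("fast_left", state="done"),
        Run("slow_right", state="scheduled"),
        Run("left_next", upstream=["fast_left"]),
        Run("final", upstream=["left_next", "slow_right"]),
    ]
```

In this toy, calling `advance_before_wait(make_runs(), until_stable=False)` leaves `left_next` in the `ready` state when the wait begins, so it sits idle until `slow_right` finishes. With `until_stable=True`, the loop keeps going until a pass changes nothing, and `left_next` is already `scheduled` by the time the wait begins.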

Testing

I set up a pipeline that I knew would enter a wait prematurely before the fix. The sleep in the left branch of the image takes seconds, while the corresponding sleep in the right branch is configured to take hours.

Before the fix, the pipeline gets stuck here until the right sleep finishes:

Screenshot 2024-04-15 at 8 18 18 AM

After the fix, the pipeline progresses as far as possible before waiting for the slow right branch:

Screenshot 2024-04-15 at 8 31 42 AM