Introducing Task Run Locking for Enhanced Concurrency Control in Gokart
tl;dr
Introduces task run locking in Gokart for better concurrency control.
Prevents redundant task executions in distributed setups.
Updates and adds documentation for efficient multi-worker execution.
Implements backoff strategies for handling task lock exceptions.
Enhances efficiency and reliability of task execution in Gokart.
Summary
This pull request makes running tasks on multiple workers in a Gokart/Luigi pipeline more efficient and reliable. It adds documentation on efficient multi-worker execution, updates the task conflict prevention mechanism, and adds backoff-based retries for task lock exceptions. Together, these changes prevent redundant task executions and make task locking more robust in distributed environments.
Changes
Documentation Addition: Added a new documentation file efficient_run_on_multi_workers.rst that guides users on how to improve efficiency when running similar Gokart pipelines on multiple workers. This includes strategies to skip completed tasks and suppress the execution of tasks already being run by another worker.
Documentation Update: Updated the index.rst to include the new documentation in the User Guide section.
Task Conflict Prevention Lock: Renamed using_task_cache_collision_lock.rst to using_task_task_conflict_prevention_lock.rst to better reflect the mechanism's purpose. The documentation within has also been updated to align with the new naming convention and clarify the prevention of task cache conflicts.
Code Enhancements:
Modified gokart/build.py to include backoff strategies when encountering TaskLockException, allowing for automatic retrying with exponential backoff until a maximum number of tries or wait time is reached.
Updated task_lock.py and task_lock_wrappers.py to support the new locking mechanism during task execution (run method), ensuring that tasks are not executed redundantly across workers.
Added a new module wrap_run_with_lock.py to facilitate wrapping the task's run method with a lock, preventing simultaneous execution of the same task by multiple workers.
Adjusted gokart/task.py to automatically apply run locking based on task configuration, enhancing task execution efficiency in distributed environments.
Dependency Addition: Added the backoff library to pyproject.toml and updated poetry.lock accordingly. The library implements the exponential backoff strategy used when handling task lock exceptions.
Impact
Efficiency: These changes reduce redundant task executions in distributed environments, which lowers wasted compute.
Reliability: Enhances the reliability of task execution in concurrent scenarios by preventing task cache conflicts and ensuring that tasks are not executed more than necessary.
Usability: The addition of documentation provides clear guidance to users on how to leverage these new features, improving the overall usability of Gokart for distributed task execution.
Testing
Updated existing tests to reflect the changes to the task locking mechanism.
Added new tests to cover the functionality of retrying task execution with exponential backoff upon encountering lock exceptions.
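For readers who want a feel for what such a retry test checks, here is a hedged sketch using unittest and unittest.mock. It is not taken from the test suite in this pull request: run_with_retry is a hypothetical helper defined only for this example, TaskLockException is redefined locally, and the one-second base delay is an assumption.

```python
import unittest
from unittest import mock


class TaskLockException(Exception):
    """Local stand-in for the exception raised when a task lock is held."""


def run_with_retry(run, max_tries, sleep, base_delay=1):
    """Hypothetical helper: retry `run` on TaskLockException with doubling delays."""
    for attempt in range(max_tries):
        try:
            return run()
        except TaskLockException:
            if attempt == max_tries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


class RetryOnLockTest(unittest.TestCase):
    def test_retries_until_lock_is_free(self):
        # The first two calls hit a held lock; the third succeeds.
        run = mock.Mock(side_effect=[TaskLockException(), TaskLockException(), "ok"])
        sleep = mock.Mock()
        self.assertEqual(run_with_retry(run, max_tries=5, sleep=sleep), "ok")
        self.assertEqual(run.call_count, 3)
        # Delays double between attempts.
        self.assertEqual([c.args[0] for c in sleep.call_args_list], [1, 2])

    def test_gives_up_after_max_tries(self):
        run = mock.Mock(side_effect=TaskLockException())
        with self.assertRaises(TaskLockException):
            run_with_retry(run, max_tries=3, sleep=mock.Mock())
        self.assertEqual(run.call_count, 3)
```

Injecting the sleep function keeps the test fast and lets it assert on the exact delay sequence without waiting in real time.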
Documentation
Added comprehensive documentation on efficient execution strategies on multiple workers.
Updated existing documentation to reflect the renaming and functionality changes in task conflict prevention.