pyinfra-dev / pyinfra

pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
https://pyinfra.com
MIT License

Wrong operation order when using preserve_loop_order with a "large" number of hosts #816

Closed julienlavergne closed 2 years ago

julienlavergne commented 2 years ago

Describe the bug

When using enough hosts and operations, there is a race condition that causes the operations to be executed in an unpredictable order. I was able to extract a minimal working example from a bigger task that consistently reproduces the issue.

To Reproduce

Insert random sleeps in the script to simulate Python execution time. Use enough hosts; 2 may be enough to reproduce the issue.

from random import randint
from time import sleep

from pyinfra import state
from pyinfra.operations import server

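# preserve_loop_order is expected to keep the operations in loop order on every host.
with state.preserve_loop_order([1, 2, 3, 4, 5, 6, 7, 8, 9]) as loop_items:
    for i in loop_items():
        # Random delay to simulate Python execution time between operations.
        sleep(randint(0, 2))
        server.shell(f"echo {i}")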

Operation order:

Output in gist because it is too long

Expected behavior

I would expect all hosts to echo the digits in the correct order. And since the current implementation synchronizes operations across all hosts, I would expect the output to contain only 9 operations, each executed on every host.

Meta

pyinfra v2.1

Fizzadar commented 2 years ago

Nice catch! This must be caused by the combination of parallel op generation and the preserve_loop_order function.
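To illustrate how that kind of interaction can go wrong, here is a toy model in plain Python. It is not pyinfra's code, and the mechanism it shows (a loop position kept in state shared by all hosts) is only an assumption about what such a bug could look like. All names (`host_task`, `run`, `SCHEDULE`) are hypothetical. Each `yield` stands in for the `sleep` in the report, a point where another host's deploy code may run.

```python
from collections import defaultdict

ITEMS = [1, 2, 3]
HOSTS = ["host-a", "host-b"]

# Hypothetical interleaving: which host runs at each step.
SCHEDULE = ["host-a", "host-a", "host-b", "host-a",
            "host-b", "host-b", "host-a", "host-b"]


def host_task(host, position, ops, per_host):
    """Simulate one host generating its operations for the loop."""
    for index, item in enumerate(ITEMS):
        # Record where we are in the loop, either per host or in one shared slot.
        slot = host if per_host else "shared"
        position[slot] = index
        yield  # another host may run here and overwrite a shared slot
        # The order key is read only after the pause, like an op added after sleep().
        ops[(position[slot], f"echo {item}")].add(host)


def run(schedule, per_host):
    """Run the hosts in the given interleaving; return sorted (order_key, op) pairs."""
    position, ops = {}, defaultdict(set)
    tasks = {h: host_task(h, position, ops, per_host) for h in HOSTS}
    for host in schedule:
        next(tasks[host], None)  # advance that host to its next pause
    return sorted(ops)


shared = run(SCHEDULE, per_host=False)
isolated = run(SCHEDULE, per_host=True)
print(len(shared), shared)
print(len(isolated), isolated)
```

In this model the shared slot produces more operations than loop items, with some out of order, because one host reads a position another host has just written. Keeping the position per host gives exactly one operation per item, in loop order, shared by both hosts.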

Fizzadar commented 2 years ago

Now fixed as of v2.2!