taskiq-python / taskiq

Distributed task queue with full async support
MIT License

Issue with multiprocessing startup: can't run worker with `ddtrace-run` #184

Open MuriloScarpaSitonio opened 1 year ago

MuriloScarpaSitonio commented 1 year ago

I used to run the `taskiq worker ...` command with ddtrace-run (i.e. `ddtrace-run taskiq worker ...`), but this now raises an error:

Traceback (most recent call last):
  File "/Users/murilositonio/.pyenv/versions/remediation-automation-api-3.11/bin/taskiq", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/envs/remediation-automation-api-3.11/lib/python3.11/site-packages/taskiq/__main__.py", line 73, in main
    command.exec(sys.argv[1:])
  File "/Users/murilositonio/.pyenv/versions/3.11.3/envs/remediation-automation-api-3.11/lib/python3.11/site-packages/taskiq/cli/worker/cmd.py", line 26, in exec
    run_worker(wargs)
  File "/Users/murilositonio/.pyenv/versions/3.11.3/envs/remediation-automation-api-3.11/lib/python3.11/site-packages/taskiq/cli/worker/run.py", line 164, in run_worker
    set_start_method("fork")
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/context.py", line 247, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
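The RuntimeError happens because `multiprocessing.set_start_method()` may only be called once per process, and `ddtrace-run` (or something it imports) seems to have already chosen a start method by the time taskiq's `run_worker` calls it — that's my reading of the traceback, not a confirmed cause. A minimal sketch of the conflict:

```python
import multiprocessing as mp

# Simulate another tool (e.g. ddtrace-run) having already chosen a
# start method before taskiq runs. This is an illustration, not
# ddtrace's actual code.
mp.set_start_method("spawn", force=True)

try:
    # What taskiq's run_worker does on the line shown in the traceback.
    mp.set_start_method("fork")
except RuntimeError as exc:
    print(exc)  # context has already been set
```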

I initially thought this had something to do with #177, but if I comment out that if condition (`if platform == "darwin"`) the workers do start, though still with an error:

[2023-07-28 11:33:57,731][taskiq.worker][INFO   ][MainProcess] Starting 2 worker processes.
[2023-07-28 11:33:57,747][taskiq.process-manager][INFO   ][MainProcess] Started process worker-0 with pid 22823 
[2023-07-28 11:33:57,753][taskiq.process-manager][INFO   ][MainProcess] Started process worker-1 with pid 22824 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 130, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 130, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory
[2023-07-28 11:33:59,967][taskiq.process-manager][INFO   ][MainProcess] worker-0 is dead. Scheduling reload.
[2023-07-28 11:33:59,967][taskiq.process-manager][INFO   ][MainProcess] worker-1 is dead. Scheduling reload.
[2023-07-28 11:34:00,975][taskiq.process-manager][INFO   ][MainProcess] Process worker-0 restarted with pid 22833
[2023-07-28 11:34:01,082][taskiq.process-manager][INFO   ][MainProcess] Process worker-1 restarted with pid 22834
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 130, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/spawn.py", line 130, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/murilositonio/.pyenv/versions/3.11.3/lib/python3.11/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory

If I roll back to an older version (0.6.0, for instance) it works. It also works for the scheduler.

Do you know what it could be?

MuriloScarpaSitonio commented 1 year ago

Btw, this works fine if I spawn some containers. Definitely a problem with darwin.
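A likely reason it works in containers but not locally (my assumption, not verified against taskiq's internals): since Python 3.8 the default multiprocessing start method on macOS is `spawn`, while on Linux it is `fork`, so the same code takes different code paths on the two platforms. You can check what your interpreter picked:

```python
import multiprocessing as mp
import sys

# Since Python 3.8 the default start method is "spawn" on macOS and
# "fork" on Linux, which would explain a darwin-only failure.
print(sys.platform, mp.get_start_method())
```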

s3rius commented 1 year ago

Hi! I thought all darwin problems were fixed in https://github.com/taskiq-python/taskiq/commit/4f9f1ae2f2b5181850fa85672e82517070d8ae13.

I need to take a deeper look at ddtrace. Also, can you please provide me with information about your machine?

MuriloScarpaSitonio commented 1 year ago

Hi @s3rius, I'm using a MacBook Pro 16 with macOS Ventura 13.5 (not sure if you'd like to see any other specific info, though).

s3rius commented 1 year ago

@MuriloScarpaSitonio, hi. I was unable to test your case on macOS, but can you please try changing this line to

        set_start_method("fork", force=True)

On Linux I couldn't reproduce this error, even with ddtrace.
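For reference, `force=True` makes `set_start_method` replace a previously configured context instead of raising, which is why it should sidestep the RuntimeError above (whether it also fixes the SemLock errors on darwin still needs testing). A quick sketch:

```python
import multiprocessing as mp

# Another tool may already have set the start method...
mp.set_start_method("spawn", force=True)

# ...but force=True overrides it instead of raising RuntimeError.
mp.set_start_method("fork", force=True)
print(mp.get_start_method())  # fork
```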