iris-hep / analysis-grand-challenge

Repository dedicated to AGC preparations & execution
https://agc.readthedocs.io
MIT License

Local execution of the `.py` version of the ttbar analysis is broken #143

Open eguiraud opened 1 year ago

eguiraud commented 1 year ago

I switched to local execution by setting `AF: local` in `config.yaml`. With that change, `python ttbar_analysis_pipeline.py` fails to run. The problem seems to be linked to running the backend "at global scope": this patch, which puts all of the data processing under `if __name__ == "__main__"`, fixes the problem.

The actual error:

    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

with this traceback:

Traceback (most recent call last):
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/site-packages/distributed/nanny.py", line 442, in instantiate
    result = await self.process.start()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/site-packages/distributed/nanny.py", line 711, in start
    await self.process.start()
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/site-packages/distributed/process.py", line 55, in _call_and_set_future
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/site-packages/distributed/process.py", line 215, in _start
    process.start()
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/spawn.py", line 158, in get_preparation_data
    _check_not_importing_main()
  File "/home/blue/Tools/miniconda3/envs/agc-py311/lib/python3.11/multiprocessing/spawn.py", line 138, in _check_not_importing_main

eguiraud commented 1 year ago

(the patch is probably too intrusive and it does not really make sense for the notebook version of the script, so I'm not sure how to proceed -- there might be a better fix)

alexander-held commented 1 year ago

Thanks for raising this, we definitely need to find a better version that works with both notebook and script.

Does that patch work in the notebook as-is? I didn't think that the `if __name__ == "__main__":` block could continue across cells. I was hoping that wrapping the functionality in `run_processor()` would already be enough by itself; does that fix the problem without the `__main__` guard? If so, that would be easiest. If not, then I'm honestly not sure what is best to do here.

eguiraud commented 1 year ago

Unfortunately, just wrapping the processing in `run_processor()` did not seem to help (but as I don't actually understand the underlying issue, it could be that I missed something simple).

To avoid the confusion, I updated the branch with the patch to a simpler version that just uses `if __name__ == "__main__"`.

EDIT: here's the simpler patch

eguiraud commented 1 year ago

More info at https://docs.dask.org/en/stable/scheduling.html#standalone-python-scripts
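
As a rough illustration of the mechanism behind the error (not the AGC code itself): with the `spawn` start method, a child process re-executes the main script under the module name `__mp_main__`, so only code behind the `__main__` guard is skipped there. The sketch below imitates that re-execution with `runpy` from the standard library instead of starting real processes. The toy script and its `run_processor()` are hypothetical stand-ins, not the function from the pipeline.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for an analysis script. It contains a function that is
# called at module scope and one statement behind the __main__ guard.
SCRIPT = textwrap.dedent(
    '''
    events = []

    def run_processor():
        events.append("run_processor() called")

    run_processor()  # wrapped in a function, but still called at module scope

    if __name__ == "__main__":
        events.append("guarded code ran")
    '''
)


def run_as(run_name: str) -> list[str]:
    """Execute SCRIPT under the given module name and return what it recorded."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "toy_pipeline.py"
        path.write_text(SCRIPT)
        namespace = runpy.run_path(str(path), run_name=run_name)
    return namespace["events"]


# What `python toy_pipeline.py` executes in the parent process.
as_parent = run_as("__main__")
# What a spawned child executes when it re-runs the main script.
as_child = run_as("__mp_main__")

print(as_parent)  # ['run_processor() called', 'guarded code ran']
print(as_child)   # ['run_processor() called']
```

The module-level call to `run_processor()` still runs in the simulated child, which would be consistent with the observation that wrapping the processing in a function is not enough on its own.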

alexander-held commented 1 year ago

Thinking about this some more: I imagine it is fine to guard only the cell (or even just the commands) where the coffea execution happens and then return to unguarded code, like this:

...  # all the other code

if __name__ == "__main__":
    run.preprocess(fileset, ...)
    run(fileset, ...)

...  # all the rest

It is perhaps a bit unusual, but this might be a minimally invasive solution, assuming it works as I imagine it does.
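
A minimal, self-contained sketch of that layout, assuming the standard library's `ProcessPoolExecutor` with the `spawn` start method as a stand-in for the coffea/Dask execution. `INPUTS`, `run_processing()` and `summarize()` are hypothetical names used only for illustration. One caveat it tries to make visible: a spawned worker re-runs the unguarded parts of the script, so code after the guard that consumes the results would probably need to sit inside the guard as well.

```python
import math
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

# "All the other code": imports, configuration, function definitions.
# This part is re-executed in every spawned worker, so it should be cheap
# and must not start processes itself.
INPUTS = [1.0, 4.0, 9.0]  # hypothetical stand-in for the fileset


def run_processing(values):
    """Stand-in for the coffea execution: fan work out to spawned workers."""
    ctx = mp.get_context("spawn")  # same start method as in the traceback
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        return list(pool.map(math.sqrt, values))


def summarize(results):
    """Stand-in for the post-processing that follows the execution."""
    return sum(results)


if __name__ == "__main__":
    results = run_processing(INPUTS)
    # Code that consumes `results` stays inside the guard: in a spawned
    # worker the guard is skipped, so `results` is never defined there.
    print(summarize(results))
```

In a notebook the guard is harmless, because `__name__` is `"__main__"` in the kernel and the guarded cell runs as usual.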