graphnet-team / graphnet

A Deep learning library for neutrino telescopes
https://graphnet-team.github.io/graphnet/
Apache License 2.0
94 stars 94 forks

Edge case for processing 1 file when >1 workers are provided #773

Open pweigel opened 4 days ago

pweigel commented 4 days ago

Describe the bug

There is an edge case when processing a single-file "dataset" with more than one worker. I guess this happens because https://github.com/graphnet-team/graphnet/blob/65ec11c8ee5f5b8448343a2254df9d6a9bc6187c/src/graphnet/data/dataconverter.py#L277-L293 sets n_workers = 1 when there is only one file and does not use multiprocessing, while https://github.com/graphnet-team/graphnet/blob/65ec11c8ee5f5b8448343a2254df9d6a9bc6187c/src/graphnet/data/dataconverter.py#L259-L263 uses self._num_workers and tries to access the global variables that are used for multiprocessing.
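A minimal, self-contained sketch of the suspected mismatch, in case it helps with triage. This is not the graphnet code: `ToyConverter` and its `convert` method are hypothetical stand-ins, and only the names `global_index`, `init_global_index`, `_num_workers`, and `_request_event_nos` are taken from the traceback and the linked lines. The pool path is left out on purpose, since the single-file path is the one that fails.

```python
def init_global_index(index):
    """Pool initializer: publish the shared counter as a module-level global."""
    global global_index
    global_index = index


class ToyConverter:
    """Hypothetical stand-in for the converter (not graphnet's DataConverter)."""

    def __init__(self, num_workers):
        self._num_workers = num_workers  # requested count, never updated
        self._index = 0

    def _request_event_nos(self, n_events):
        # The branch is chosen from the *requested* worker count ...
        if self._num_workers > 1:
            # ... but the global only exists if the pool initializer has run.
            with global_index.get_lock():
                start = global_index.value
                global_index.value += n_events
        else:
            start = self._index
            self._index += n_events
        return list(range(start, start + n_events))

    def convert(self, files):
        # The *effective* worker count is capped by the number of files.
        n_workers = min(self._num_workers, len(files))
        if n_workers > 1:
            raise NotImplementedError("pool path omitted from this sketch")
        # Single-file path: no pool, so init_global_index is never called.
        return [self._request_event_nos(3) for _ in files]


if __name__ == "__main__":
    try:
        ToyConverter(num_workers=4).convert(["only_file.i3"])
    except NameError as err:
        print(f"NameError: {err}")
```

With `num_workers=1` the same call goes through the non-shared branch and works, which matches what I see in practice.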

To Reproduce

Steps to reproduce the behavior:

  1. Process i3 files with more than one worker when the input folder contains only one file

Expected behavior

The converter should allocate just one worker and process the file normally.
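One possible way to get this behavior, sketched under the assumption that the file list is known before the code path is chosen: resolve the effective worker count once and use that same value everywhere, including in the check that decides whether to touch the shared globals. `effective_num_workers` is a hypothetical helper, not something that exists in graphnet.

```python
def effective_num_workers(requested: int, n_files: int) -> int:
    """Never use more workers than files, and always use at least one."""
    return max(1, min(requested, n_files))
```

Storing the result back on the converter (for example in `self._num_workers`) before any event numbers are requested would keep the two code paths in agreement, though the maintainers may prefer a different fix.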

Full traceback

File "<path>/graphnet/src/graphnet/data/dataconverter.py", line 260, in _request_event_nos
    with global_index.get_lock():  # type: ignore[name-defined]
         ^^^^^^^^^^^^
NameError: name 'global_index' is not defined. Did you mean: 'init_global_index'?