Kaszanas / SC2_Datasets

https://sc2-datasets.readthedocs.io/
GNU General Public License v3.0

Increasing num_worker crashes the program #11

Closed: Kaszanas closed this issue 2 years ago

Kaszanas commented 2 years ago

While PyTorch Lightning suggested increasing the num_workers parameter, it seems that there are some issues with doing so.
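For context, the setting being increased is the num_workers argument of the DataLoader(s) produced by the datamodule; a rough illustration with a toy dataset and placeholder values, not the actual SC2EGSet_Experiments code:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Toy stand-in for the replay dataset. Any num_workers > 0 makes the DataLoader
    # start worker processes once iteration begins; on Windows those workers are
    # created with spawn, which re-imports the main module.
    dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
    loader = DataLoader(dataset, batch_size=16, num_workers=4)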

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\spawn.py", line 116, in spawn_main    
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\spawn.py", line 236, in prepare       
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\runpy.py", line 269, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "d:\Projects\SC2EGSet_Experiments\src\experiments\logistic_regression.py", line 53, in <module>
    trainer.fit(model=logistic_regression, datamodule=datamodule)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
    self._dispatch()
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
    return self._run_train()
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1311, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1375, in _run_sanity_check
    self._evaluation_loop.run()
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\base.py", line 140, in run
    self.on_run_start(*args, **kwargs)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 86, in on_run_start
    self._dataloader_iter = _update_dataloader_iter(data_fetcher, self.batch_progress.current.ready)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\utilities.py", line 121, in _update_dataloader_iter
    dataloader_iter = enumerate(data_fetcher, batch_idx)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 197, in __iter__
    self.dataloader_iter = iter(self.dataloader)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataloader.py", line 368, in __iter__
    return self._get_iterator()
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataloader.py", line 314, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "d:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataloader.py", line 927, in __init__
    w.start()
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\process.py", line 121, in start       
    self._popen = self._Popen(self)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\context.py", line 224, in _Popen      
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\context.py", line 327, in _Popen      
    return Popen(process_obj)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

fyi: @leafnode

Kaszanas commented 2 years ago

This seems to be fixed now by introducing the if __name__ == "__main__": guard. I think this is required because Windows starts worker processes with spawn rather than fork, so each worker re-imports the main module; without the guard, that re-import would start training (and its own workers) again, spawning processes recursively.
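For reference, a minimal sketch of the guarded entry point, assuming a Lightning script shaped like the logistic_regression.py from the traceback (the model, dataset, and parameter values below are placeholders, not the actual SC2EGSet_Experiments code):

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset
    import pytorch_lightning as pl

    class TinyModel(pl.LightningModule):
        # Placeholder standing in for the project's logistic regression model.
        def __init__(self):
            super().__init__()
            self.layer = nn.Linear(8, 2)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return nn.functional.cross_entropy(self.layer(x), y)

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)

    def main():
        dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
        loader = DataLoader(dataset, batch_size=16, num_workers=4)
        trainer = pl.Trainer(max_epochs=1)
        trainer.fit(TinyModel(), train_dataloaders=loader)

    if __name__ == "__main__":
        # The spawned DataLoader workers re-import this module; the guard keeps
        # that re-import from calling main() and starting training again.
        main()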

Kaszanas commented 2 years ago

This is fixed.