Something wrong about multiprocessing

psychopa4 commented 6 years ago

I ran into a problem during training. Something seems to go wrong when multiple processes are used to produce the training data.

Preparing loss function...
[{'type': 'L1', 'weight': 1.0, 'function': L1Loss( )}]
[Epoch 1] Learning rate: 1.00e-3
Traceback (most recent call last):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "main.py", line 17, in <module>
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\spawn.py", line 105, in spawn_main
    t.train()
    exitcode = _main(fd)
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\trainer.py", line 48, in train
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\spawn.py", line 114, in _main
    for batch, (input, target, idx_scale) in enumerate(self.loader_train):
    prepare(preparation_data)
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\dataloader.py", line 133, in __iter__
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\spawn.py", line 225, in prepare
    return MSDataLoaderIter(self)
    _fixup_main_from_path(data['init_main_from_path'])
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\dataloader.py", line 106, in __init__
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    w.start()
    run_name="__mp_main__")
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\process.py", line 105, in start
  File "d:\Anaconda3\envs\pyth\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "d:\Anaconda3\envs\pyth\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "d:\Anaconda3\envs\pyth\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\main.py", line 17, in <module>
    t.train()
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\trainer.py", line 48, in train
    for batch, (input, target, idx_scale) in enumerate(self.loader_train):
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\dataloader.py", line 133, in __iter__
    return MSDataLoaderIter(self)
  File "F:\YiPeng\py\EDSR-PyTorch-master\code\dataloader.py", line 106, in __init__
    w.start()
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
    self._popen = self._Popen(self)
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\context.py", line 223, in _Popen
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
    return _default_context.get_context().Process._Popen(process_obj)
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\context.py", line 322, in _Popen
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    return Popen(process_obj)
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    reduction.dump(process_obj, to_child)
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\reduction.py", line 60, in dump
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "d:\Anaconda3\envs\pyth\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
ForkingPickler(file, protocol).dump(obj)

BrokenPipeError: [Errno 32] Broken pipe

sanghyun-son commented 6 years ago

Hello.

Maybe there are some conflicts between PyTorch core and my code.

Would you explain your environment in detail? (ex. PyTorch version, Python version, your OS)

Thank you.

psychopa4 commented 6 years ago

Python 3.6.4, PyTorch 0.3.1. The OS seems to be the cause: multiprocessing does not work properly on Windows 10. @thstkdgus35
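
For anyone hitting the same error: Windows has no fork, so Python starts worker processes with the spawn method, which re-imports the main script in every child. In the traceback above, `t.train()` appears to run at module level in main.py (line 17), so each spawned worker would try to start training, and more workers, again. Below is a minimal sketch of the guard that the error message asks for. It uses only the standard library; `make_batch`, `worker`, and `train` are hypothetical stand-ins, not code from this repository.

```python
import multiprocessing as mp


def make_batch(index):
    # Hypothetical stand-in for the work a data-loading worker would do.
    return [index] * 3


def worker(index):
    # Runs in the child process.
    batch = make_batch(index)
    print(f"worker {index} produced a batch of {len(batch)} items")


def train(num_workers=2):
    # Request 'spawn' explicitly so the sketch behaves the same way on
    # Linux and macOS as it does on Windows, where spawn is the only option.
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(i,)) for i in range(num_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == "__main__":
    # Only the original process enters this block. Spawned children import
    # this file under a different module name and skip it.
    mp.freeze_support()  # only matters for frozen executables
    train()
```

The same idea would presumably apply here by moving the module-level training call in main.py under an `if __name__ == '__main__':` block. I have not tested that against this repository.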

psychopa4 commented 6 years ago

I also wonder whether you add the bicubic-upsampled LR input to the output of the network. As you know, this strategy is universally adopted, but I didn't see it in your code. @thstkdgus35

sanghyun-son commented 6 years ago

Our method does not add the bicubic-upsampled low-resolution image to the network output.

There are several reasons, but we found that this approach yields better results.
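
In symbols (the notation is introduced here for illustration and does not come from the code): with $x$ the low-resolution input, $f_\theta$ the network, and $\uparrow_s^{\text{bic}}$ bicubic upsampling by scale $s$, the strategy asked about and the one described in this reply are, respectively,

$$
\hat{y}_{\text{residual}} = f_\theta(x) + \uparrow_s^{\text{bic}}(x),
\qquad
\hat{y}_{\text{direct}} = f_\theta(x).
$$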

sanghyun-son / EDSR-PyTorch

Something wrong about multiprocessing #20