alterzero / RBPN-PyTorch

This project is the official implementation of our CVPR 2019 paper "Recurrent Back-Projection Network for Video Super-Resolution"
https://alterzero.github.io/projects/RBPN.html
MIT License

Multiprocessing error on Windows #42

Closed sjscotti closed 5 years ago

sjscotti commented 5 years ago

Hi! I am anxious to experiment with your RBPN code. I have downloaded it onto a Windows 10 machine with a GPU and installed Python 3.5, PyTorch 1.0.1, and the Pyflow dependency (plus any lower-level packages that were missing). When running the eval.py script, I get the following error (I needed to interrupt the stalled process at the end):

Namespace(chop_forward=False, data_dir='./Vid4', file_list='foliage.txt', future_frame=True, gpu_mode=True, gpus=1, model='weights/RBPN_4x.pth', model_type='RBPN', nFrames=7, other_dataset=True, output='Results/', residual=False, seed=123, testBatchSize=1, threads=1, upscale_factor=4)
===> Loading datasets
===> Building model RBPN
Pre-trained SR model is loaded.
Namespace(chop_forward=False, data_dir='./Vid4', file_list='foliage.txt', future_frame=True, gpu_mode=True, gpus=1, model='weights/RBPN_4x.pth', model_type='RBPN', nFrames=7, other_dataset=True, output='Results/', residual=False, seed=123, testBatchSize=1, threads=1, upscale_factor=4)
===> Loading datasets
===> Building model RBPN
Pre-trained SR model is loaded.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 115, in _main
    prepare(preparation_data)
  File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Program Files\Python35\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Program Files\Python35\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Program Files\Python35\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Steve\Downloads\RBPN-PyTorch-master\RBPN-PyTorch-master\eval.py", line 182, in <module>
    eval()
  File "C:\Users\Steve\Downloads\RBPN-PyTorch-master\RBPN-PyTorch-master\eval.py", line 79, in eval
    for batch in testing_data_loader:
  File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "C:\Program Files\Python35\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Program Files\Python35\lib\multiprocessing\context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Program Files\Python35\lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "C:\Program Files\Python35\lib\multiprocessing\popen_spawn_win32.py", line 34, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 144, in get_preparation_data
    _check_not_importing_main()
  File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 137, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Traceback (most recent call last):
  File "eval.py", line 182, in <module>
    eval()
  File "eval.py", line 79, in eval
    for batch in testing_data_loader:
  File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 631, in __next__
    idx, batch = self._get_batch()
  File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 610, in _get_batch
    return self.data_queue.get()
  File "C:\Program Files\Python35\lib\multiprocessing\queues.py", line 94, in get
    res = self._recv_bytes()
  File "C:\Program Files\Python35\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "C:\Program Files\Python35\lib\multiprocessing\connection.py", line 306, in _recv_bytes
    [ov.event], False, INFINITE)
KeyboardInterrupt

I am not very familiar with Python, but from trying to understand this error, it appears to be unique to Windows because Windows spawns new processes rather than forking them as Linux does. See the note under torch.utils.data.DataLoader here:

https://pytorch.org/docs/stable/data.html?highlight=dataloader%20py#torch.utils.data.DataLoader

The suggested correction is to add an if __name__ == '__main__': guard at the appropriate places in the eval.py code. When that is done correctly, the code should work on both Windows and Linux. I experimented with adding this check at several locations in the code but was unable to get a complete run. Can you suggest how to modify the code to work on Windows, or another change that would allow it to run?
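
For reference, the idiom that note recommends looks roughly like this; the dataset and loader below are placeholders for illustration, not the ones eval.py actually builds from the Vid4 data:

import torch
from torch.utils.data import DataLoader, TensorDataset

def run_eval():
    # Placeholder dataset and loader; eval.py builds these from the Vid4 frames.
    dataset = TensorDataset(torch.zeros(8, 3, 32, 32))
    loader = DataLoader(dataset, batch_size=1, num_workers=1)
    for batch in loader:   # with num_workers > 0 this starts worker processes
        pass

if __name__ == '__main__':
    # On Windows, workers are started with "spawn", which re-imports this
    # module; the guard keeps the children from re-running run_eval().
    run_eval()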

Thanks -Steve

sjscotti commented 5 years ago

Hi! I found that this problem was easy to fix. You just take the couple of lines at the end of eval.py that call eval() and replace them with this:

##Eval Start!!!!
if __name__ == '__main__':
    eval()

I have another question that I will ask in another thread.

CybotDNA commented 2 years ago

try the following "eval.py":

def eval():
    model.eval()
    count=1
    avg_psnr_predicted = 0.0
    for batch in testing_data_loader:
        input, target, neigbor, flow, bicubic = batch[0], batch[1], batch[2], batch[3], batch[4]

        with torch.no_grad():
            # ... rest of the original evaluation loop goes here, unchanged ...
            ...

if __name__ == '__main__':
    eval()
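
With the guard at module level, the child processes that Windows spawns for the DataLoader workers re-import eval.py but skip the final eval() call, so only the parent process drives the evaluation loop. If you would rather not edit the script, it may also be enough to avoid worker processes altogether: assuming the --threads option shown in the Namespace output is what eval.py passes to the DataLoader as num_workers (I have not verified this), running with --threads 0 should keep data loading in the main process and sidestep the spawn check.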