sczhou / ProPainter

[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
https://shangchenzhou.com/projects/ProPainter/

FileNotFoundError: [Errno 2] No such file or directory: '/mnt/lustre/sczhou/VQGANs/CodeMOVI/experiments_model/recurrent_flow_completion_v5_train_flowcomp_v5/gen_760000.pth' #71

Closed mmehedin closed 5 months ago

mmehedin commented 7 months ago

Hi, I ran into a small problem:

FileNotFoundError: [Errno 2] No such file or directory: '/mnt/lustre/sczhou/VQGANs/CodeMOVI/experiments_model/recurrent_flow_completion_v5_train_flowcomp_v5/gen_760000.pth'

This was generated by running:

python train.py -c configs/train_propainter.json

The failing call is in the file ProPainter/core/trainer.py.

The complete output looks like this:

` python train.py -c configs/train_propainter.json
world_size: 4
using GPU 0-0 for training
[**] create folder experiments_model/propainter_trainpropainter
./datasets/davis/train.json
using GPU 1-1 for training
./datasets/davis/train.json
using GPU 3-3 for training
./datasets/davis/train.json
using GPU 2-2 for training
./datasets/davis/train.json
Pretrained flow completion model has loaded...
Pretrained flow completion model has loaded...
Pretrained flow completion model has loaded...
Pretrained flow completion model has loaded...
Traceback (most recent call last):
  File "/home/jovyan/work/flowguidedtransformer/ProPainter_FO/train.py", line 105, in <module>
    mp.spawn(main_worker, nprocs=torch.cuda.device_count(), args=(config, ))
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 239, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 197, in start_processes
    while not context.join():
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/jovyan/work/flowguidedtransformer/ProPainter_FO/train.py", line 74, in main_worker
    trainer = core.__dict__[trainer_version].__dict__['Trainer'](config)
  File "/home/jovyan/work/flowguidedtransformer/ProPainter_FO/core/trainer.py", line 75, in __init__
    self.fix_flow_complete = RecurrentFlowCompleteNet('/mnt/lustre/sczhou/VQGANs/CodeMOVI/experiments_model/recurrent_flow_completion_v5_train_flowcomp_v5/gen_760000.pth')
  File "/home/jovyan/work/flowguidedtransformer/ProPainter_FO/model/recurrent_flow_completion.py", line 268, in __init__
    ckpt = torch.load(model_path, map_location='cpu')
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/opt/conda/envs/fgt/lib/python3.10/site-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/lustre/sczhou/VQGANs/CodeMOVI/experiments_model/recurrent_flow_completion_v5_train_flowcomp_v5/gen_760000.pth' `
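
The traceback suggests that core/trainer.py passes an absolute path from the author's own cluster to RecurrentFlowCompleteNet, so the file cannot exist on another machine. A possible workaround, sketched here rather than taken from the repository, is to resolve the checkpoint path before the model is built. The helper name `resolve_checkpoint`, the environment variable `FLOW_COMPLETE_CKPT`, and the local path `weights/recurrent_flow_completion.pth` are all assumptions for illustration and may not match your checkout.

```python
import os

# Path that appears in the traceback (hard-coded in core/trainer.py).
HARDCODED_CKPT = (
    "/mnt/lustre/sczhou/VQGANs/CodeMOVI/experiments_model/"
    "recurrent_flow_completion_v5_train_flowcomp_v5/gen_760000.pth"
)


def resolve_checkpoint(candidates):
    """Return the first candidate that is an existing file.

    `candidates` may contain None or empty strings; those are skipped.
    Raises FileNotFoundError listing every path that was tried.
    """
    tried = []
    for path in candidates:
        if not path:
            continue
        tried.append(path)
        if os.path.isfile(path):
            return path
    raise FileNotFoundError(
        "No flow completion checkpoint found. Tried: " + ", ".join(tried)
    )


if __name__ == "__main__":
    try:
        ckpt_path = resolve_checkpoint([
            os.environ.get("FLOW_COMPLETE_CKPT"),      # hypothetical override
            "weights/recurrent_flow_completion.pth",   # hypothetical local copy
            HARDCODED_CKPT,                            # original path, last resort
        ])
        print("Using checkpoint:", ckpt_path)
        # In core/trainer.py this would then become something like:
        #   self.fix_flow_complete = RecurrentFlowCompleteNet(ckpt_path)
    except FileNotFoundError as err:
        print(err)
```

With this in place, a missing checkpoint fails with a message that lists every location checked, instead of pointing only at a path that exists on someone else's filesystem.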