Inference error - Githubissues

SeungjunNah / DeepDeblur-PyTorch

Deep Multi-scale CNN for Dynamic Scene Deblurring

MIT License

262 stars, 42 forks

Open davidvct opened 7 months ago

davidvct commented 7 months ago

I ran the command below for inference but hit an error. Does anyone know how to fix it?
- command: python launch.py --n_GPUs 1 main.py --batch_size 8 --precision single
- error:
  [W socket.cpp:401] [c10d] The server socket has failed to bind to [::]:8023 (errno: 98 - Address already in use).
  [W socket.cpp:401] [c10d] The server socket has failed to bind to workstation2:8023 (errno: 98 - Address already in use).
  [E socket.cpp:435] [c10d] The server socket has failed to listen on any local network address.
Another question is, how do I specify which model to use for inference?

SeungjunNah commented 7 months ago

Are you launching many jobs from a single machine? Use different master ports per job. https://github.com/SeungjunNah/DeepDeblur-PyTorch/blob/master/src/option.py#L33
As for the second question, please provide more details.
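For anyone hitting the same "Address already in use" error, here is a minimal sketch of one way to pick an unused port for each job. It asks the operating system for a free TCP port by binding to port 0. The helper name `find_free_port` is made up for illustration and is not part of this repository. The port still has to be passed to the launcher through the master port option linked above. Another process could also grab the port between this check and the actual launch, so treat the result as a good guess rather than a guarantee.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 lets the OS choose any free port.
        s.bind((host, 0))
        # getsockname() returns (host, port) for an IPv4 socket.
        return s.getsockname()[1]


port = find_free_port()
print(port)
```

Calling this once per job and giving each job its own port avoids two jobs on the same machine competing for the default port.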

davidvct commented 7 months ago

Changed the port and it works. Thanks!
I was asking how to run inference on a test dataset with a specific saved model. I have now managed to get prediction running with the command below:
python launch.py --n_GPUs 1 main.py --save_dir 2024-04-01_14-13-08 --do_train False --do_validate False --start_epoch 270 --load_epoch 270

But I ran into two issues:
- Inference runs on only some of the images. I have 60 test images in the test folder, but only 6 were predicted.
- After finishing prediction with model-270.pt, the script goes on to predict with model-280.pt. It is not a big issue, but something to consider for a future improvement.

SeungjunNah commented 7 months ago

Please refer to Usage examples - Example commands

# save all of the evaluation results
python main.py --n_GPUs 1 --batch_size 8 --dataset GOPRO_Large --save_results all

Please refer to args.end_epoch and see how it is used in main.py.
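To illustrate the hint about `args.end_epoch`, below is a rough sketch of how a loop bounded by a start and end epoch could go on to a later checkpoint. It is not the code from main.py. The helper `epochs_to_evaluate` and the `test_every` interval of 10 are assumptions made only for this example. The point is that if the end epoch is left above the start epoch, the loop keeps going after the first checkpoint.

```python
def epochs_to_evaluate(start_epoch, end_epoch, test_every=10):
    """List the epochs a start..end loop would evaluate.

    Hypothetical helper: test_every is an assumed evaluation interval,
    not a value taken from the repository.
    """
    return [
        epoch
        for epoch in range(start_epoch, end_epoch + 1)  # end is inclusive here
        if epoch % test_every == 0
    ]


# End epoch left above the start epoch: evaluation continues to 280.
print(epochs_to_evaluate(270, 280))  # [270, 280]

# End epoch equal to the start epoch: only epoch 270 is evaluated.
print(epochs_to_evaluate(270, 270))  # [270]
```

If main.py behaves like this sketch, setting the end epoch to the same value as the start epoch should restrict inference to a single checkpoint, but check how the argument is actually used before relying on that.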