plemeri / UACANet

Official PyTorch implementation of UACANet: Uncertainty Augmented Context Attention for Polyp Segmentation (ACMMM 2021)
MIT License

About checkpoint saved #14

Closed cqlouis closed 1 year ago

cqlouis commented 1 year ago

In run/Train.py, line 122:

```python
if epoch % opt.Train.Checkpoint.checkpoint_epoch == 0:
    torch.save(model.module.state_dict() if args.device_num > 1 else model.state_dict(),
               os.path.join(opt.Train.Checkpoint.checkpoint_dir, 'latest.pth'))
```

To my understanding, this code fragment saves a checkpoint every 20 epochs, which cannot guarantee that the saved checkpoint is the optimal one reached during training.

And in line 130:

```python
if args.local_rank <= 0:
    torch.save(model.module.state_dict() if args.device_num > 1 else model.state_dict(),
               os.path.join(opt.Train.Checkpoint.checkpoint_dir, 'latest.pth'))
```

this code just saves the weights of the last epoch, so it also cannot guarantee that the saved checkpoint is the optimal one.

Why don't you save the optimal checkpoint? Would you mind explaining it for me? Many thanks to you! Happy New Year!

plemeri commented 1 year ago

Hi, the purpose of saving the model during training in this repository is to resume training if it stops unexpectedly. We only save every 20 epochs since each epoch takes only a few minutes. Saving the best checkpoint by evaluating the model on a validation dataset every epoch would be a good idea, but we did not set up a validation dataset.
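For anyone who does build a validation split, a best-checkpoint loop could be sketched roughly as below. Note this is only an illustration, not code from this repository: `train_with_best_checkpoint`, the `validate` callback, and the `best.pth` filename are all hypothetical names.

```python
import os
import torch
import torch.nn as nn

def train_with_best_checkpoint(model, validate, epochs, ckpt_dir):
    """Hypothetical sketch: save 'best.pth' whenever the validation
    metric (e.g. mean Dice on a held-out split) improves."""
    best_metric = float('-inf')
    for epoch in range(1, epochs + 1):
        # ... one training epoch would run here ...
        metric = validate(model)  # higher is assumed to be better
        if metric > best_metric:
            best_metric = metric
            # Overwrite the best checkpoint so far
            torch.save(model.state_dict(), os.path.join(ckpt_dir, 'best.pth'))
    return best_metric
```

The latest-checkpoint saving the repository already does could be kept alongside this for resuming interrupted runs; the two serve different purposes.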

Moreover, I agree that it might not be the optimal checkpoint, but to keep the training setting unambiguous for a fair comparison on the test benchmarks, we only use the latest checkpoint. It would be cheating if we searched for and used a checkpoint from the middle of training.

Thanks, and happy new year to you too!