Dear author, thank you very much for your contribution. When reproducing your code, I reduced the number of sandwich blocks to num_blocks=[2, 3, 3, 4] because of limited computing power, and the model trains normally. However, the checkpoints folder contains only the result images; no weight files (.pth) are generated. I ran your code on a 4080TI (16GB video memory) machine for 135,000 iterations, expecting weights to be saved every 2,700 iterations, but none were written. I don't know why, and I apologize for taking up your time during your busy schedule.
At line 178 of HINT.py, I added print("++++++++++++++ZZB++++++++++++++") inside the if statement under the # save model at checkpoints comment, but it is never printed during training.
if self.config.SAVE_INTERVAL and iteration % self.config.SAVE_INTERVAL == 0:
    print("++++++++++++++ZZB++++++++++++++")
    self.save()
print('\nEnd training....')
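For reference, here is a minimal standalone sketch of the same condition. The helper names should_save and save_points and the hard-coded numbers are my own, for illustration only; they are not taken from HINT.py. With a save interval of 2700 I would expect the condition to fire 50 times over 135,000 iterations. It would never fire if the value actually loaded from the config were 0, None, or larger than the total number of iterations, so that is the first thing I plan to check.

```python
def should_save(save_interval, iteration):
    """Mirror of the condition quoted above (hypothetical helper, not part of HINT.py)."""
    return bool(save_interval and iteration % save_interval == 0)


TOTAL_ITERATIONS = 135_000  # 50 epochs x 2700 iterations, as in my run


def save_points(save_interval, total=TOTAL_ITERATIONS):
    """List every iteration in 1..total at which the condition is true."""
    return [i for i in range(1, total + 1) if should_save(save_interval, i)]


# Interval I believe I configured: fires at 2700, 5400, ..., 135000.
expected = save_points(2700)
print(len(expected), expected[0], expected[-1])

# Values that would silently disable saving (assumed failure modes, not confirmed).
for interval in (0, None, 1_000_000):
    print(interval, len(save_points(interval)))
```

Printing self.config.SAVE_INTERVAL and iteration just before the if statement in the training loop should show which of these cases applies.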
output.log
start training...
Training epoch: 1
/home/liu/anaconda3/envs/HINT/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG16_Weights.IMAGENET1K_V1. You can also use weights=VGG16_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
1/2700 [....................] - ETA: 4:02:22 - epoch: 1 - iter: 1 - gLoss: 1.1641 - dLoss: 0.2642 - psnr: 26.7716 - mae: 0.0244
2/2700 [....................] - ETA: 2:06:30 - epoch: 1 - iter: 2 - gLoss: 1.0937 - dLoss: 0.2572 - psnr: 27.6237 - mae: 0.0223
3/2700 [....................] - ETA: 1:27:54 - epoch: 1 - iter: 3 - gLoss: 2.0004 - dLoss: 0.2554 - psnr: 28.3510 - mae: 0.0179
.....
40/2700 [....................] - ETA: 18:28 - epoch: 1 - iter: 40 - gLoss: 1.4206 - dLoss: 0.2498 - psnr: 27.3954 - mae: 0.0205/home/liu/ZZB/HINT-main/checkpoints/results/inpaint/joint/0001.png
0001.png complete!
....
2699/2700 [==================>.] - ETA: 0s - epoch: 50 - iter: 134999 - gLoss: 0.1235 - dLoss: 0.2412 - psnr: 33.9847 - mae: 0.0078
End training....