Closed N0manDemo closed 3 years ago
Hello! Can you share your options configuration file?
Ah, I didn't see the error.log at first. For PPON, you first need to configure the losses (type, weight, etc.) as you normally would, and then choose which of those losses are used in each phase. In your case, the configuration should look something like this:
pixel_criterion: l1
pixel_weight: 1e-2
cx_weight: 0.5
cx_type: contextual
cx_vgg_layers: {conv_3_2: 1, conv_4_2: 1}
ssim_type: ms-ssim
ssim_weight: 1
ms_criterion: multiscale-l1
ms_weight: 1e-2
gan_type: vanilla
gan_weight: 0.005
p1_losses: ['pix']
p2_losses: ['pix-multiscale', 'ms-ssim']
p3_losses: ['contextual']
As you can see, the pixel loss, multiscale pixel loss, multiscale SSIM and contextual loss are all configured before being assigned to a phase. Let me know if this fixes the problem.
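If it helps, here is a minimal sketch of a pre-flight check for this kind of mistake. It is not part of BasicSR: the function name `find_unconfigured_losses` and the `REQUIRED_KEYS` mapping are hypothetical, and the mapping from loss names to option keys is only inferred from the config above, so adjust it to whatever the code actually expects.

```python
# Hypothetical mapping from phase loss names to the option keys that configure them.
# Inferred from the example config in this thread, NOT taken from the BasicSR source.
REQUIRED_KEYS = {
    "pix": ("pixel_criterion", "pixel_weight"),
    "pix-multiscale": ("ms_criterion", "ms_weight"),
    "ms-ssim": ("ssim_type", "ssim_weight"),
    "contextual": ("cx_type", "cx_weight"),
}


def find_unconfigured_losses(train_opt):
    """Return {phase_key: [(loss_name, missing_keys), ...]} for losses a phase
    lists but the options do not configure. An empty dict means nothing was found."""
    problems = {}
    for phase_key in ("p1_losses", "p2_losses", "p3_losses"):
        for name in train_opt.get(phase_key, []):
            keys = REQUIRED_KEYS.get(name)
            if keys is None:
                missing = ["<unknown loss name>"]
            else:
                # A key that is absent, empty, or has weight 0 counts as missing.
                missing = [k for k in keys if not train_opt.get(k)]
            if missing:
                problems.setdefault(phase_key, []).append((name, missing))
    return problems


# The configuration suggested above, written as a plain dict.
complete_opt = {
    "pixel_criterion": "l1", "pixel_weight": 1e-2,
    "cx_type": "contextual", "cx_weight": 0.5,
    "ssim_type": "ms-ssim", "ssim_weight": 1,
    "ms_criterion": "multiscale-l1", "ms_weight": 1e-2,
    "p1_losses": ["pix"],
    "p2_losses": ["pix-multiscale", "ms-ssim"],
    "p3_losses": ["contextual"],
}

# Same phases, but only the pixel loss is configured.
incomplete_opt = {
    "pixel_criterion": "l1", "pixel_weight": 1e-2,
    "p1_losses": ["pix"],
    "p2_losses": ["pix-multiscale", "ms-ssim"],
    "p3_losses": ["contextual"],
}

complete_report = find_unconfigured_losses(complete_opt)
incomplete_report = find_unconfigured_losses(incomplete_opt)
```

With the full configuration the report is empty. With only the pixel loss configured, it flags both p2 losses and the p3 loss, which matches the situation where training fails as soon as it switches to p2.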
Thank you, that fixed the problem. I was missing quite a few options from the list.
I was training a model with PPON (192) + MultiScale + Diffaug, and I received the following error when moving to Phase 2. I have AMP disabled because my GPU doesn't support it. error.log
21-01-27 11:26:52.449 - INFO: Random seed: 0
21-01-27 11:26:52.647 - INFO: Dataset [LRHRDataset - DIV2K] is created.
21-01-27 11:26:52.647 - INFO: Number of train images: 37,933, iters: 2,371
21-01-27 11:26:52.647 - INFO: Total epochs needed: 43 for iters 100,000
21-01-27 11:26:52.648 - INFO: Dataset [LRHRDataset - val_set14_part] is created.
21-01-27 11:26:52.648 - INFO: Number of val images in [val_set14_part]: 1
21-01-27 11:26:52.650 - INFO: AMP library available
21-01-27 11:26:52.827 - INFO: Initialization method [kaiming]
21-01-27 11:26:54.127 - INFO: Initialization method [kaiming]
21-01-27 11:26:54.185 - INFO: Loading pretrained model for G [../experiments/pretrained_models/PPON_G.pth] ...
21-01-27 11:26:55.276 - INFO: Network G structure: DataParallel - PPON, with parameters: 17,267,657
21-01-27 11:26:55.277 - INFO: Network D structure: DataParallel - MultiscaleDiscriminator, with parameters: 8,296,899
21-01-27 11:26:55.277 - INFO: Model [PPONModel] is created.
21-01-27 11:26:55.277 - INFO: Start training from epoch: 0, iter: 0
21-01-27 11:26:55.991 - INFO: Switching to phase: p2, step: 1
Traceback (most recent call last):
  File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 382, in <module>
    main()
  File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 378, in main
    fit(model, opt, dataloaders, steps_states, data_params, loggers)
  File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 221, in fit
    model.optimize_parameters(virtual_step) # calculate loss functions, get gradients, update network weights
  File "/mnt/ext4-storage/Training/BasicSR/codes/models/ppon_model.py", line 199, in optimize_parameters
    l_g_total.backward()
AttributeError: 'float' object has no attribute 'backward'
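For anyone hitting the same traceback, one plausible reading (an assumption, not verified against ppon_model.py) is that the total generator loss starts as a plain Python float and only becomes a tensor once at least one configured loss is added to it. If none of the losses listed for the new phase is configured, the total is still a float when `backward()` is called. The sketch below reproduces that failure mode with a hypothetical `FakeTensor` stand-in, so it runs without PyTorch; `total_loss` is likewise an illustrative name, not a function from the repository.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor-like loss value (not a real torch.Tensor)."""

    def __init__(self, value):
        self.value = value
        self.backward_called = False

    def __add__(self, other):
        # FakeTensor + FakeTensor or FakeTensor + number -> FakeTensor
        other_value = other.value if isinstance(other, FakeTensor) else other
        return FakeTensor(self.value + other_value)

    def __radd__(self, other):
        # number + FakeTensor -> FakeTensor, so a float accumulator gets promoted
        return FakeTensor(other + self.value)

    def backward(self):
        self.backward_called = True


def total_loss(active_losses):
    """Sum the active loss terms, starting from a plain float accumulator."""
    total = 0.0
    for loss in active_losses:
        total = total + loss
    return total


# At least one active loss: the float is promoted and backward() exists.
ok_total = total_loss([FakeTensor(0.3), FakeTensor(0.2)])
ok_total.backward()

# No active loss for this phase: the total is still a float.
empty_total = total_loss([])
try:
    empty_total.backward()
except AttributeError as err:
    message = str(err)
```

In the second case `message` holds the same text as the last line of the traceback above, which is consistent with the fix in this thread: once the phase's losses are configured, the total is no longer a bare float.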