psychopa4 / PFNL

Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations

Lower retrained results #15

Open Marshall-yao opened 4 years ago

Marshall-yao commented 4 years ago

Hi, thanks for your wonderful ICCV 2019 work on PFNL. I have a problem that may need your help.

When I retrained the model, the test results are an average PSNR of 27.3196 and SSIM of 0.8353 (210k steps).

But when I test the provided pretrained model, I get PSNR 27.4053 and SSIM 0.8383.

I retrained the model without any change to the source code, except setting max_step to 2.1e5+1 (the original is 1.5e5+1).

Does anyone have any ideas?

psychopa4 commented 4 years ago

As presented in the paper, the learning rate is decayed gradually from 1e-3 to 1e-4 over the first 1.2e5 iterations. Then we train the network with lr=1e-4 until 1.5e5 iterations, after which we set the learning rate manually like:

boundaries = [1.5e5, 1.7e5, 1.9e5]
values = [1e-4, 0.5e-4, 0.25e-4, 0.1e-4]
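
For anyone reproducing this schedule, here is a minimal sketch of how such a piecewise-constant learning rate could be wired up in TF1-style TensorFlow; the variable names and the choice of Adam are illustrative assumptions, not taken from pfnl.py.

```python
import tensorflow as tf

# Global training step, incremented once per optimizer update.
global_step = tf.train.get_or_create_global_step()

# Manual schedule from above: 1e-4 before 1.5e5 steps, 0.5e-4 from 1.5e5
# to 1.7e5, 0.25e-4 from 1.7e5 to 1.9e5, and 0.1e-4 afterwards.
boundaries = [150000, 170000, 190000]
values = [1e-4, 0.5e-4, 0.25e-4, 0.1e-4]
learning_rate = tf.train.piecewise_constant(global_step, boundaries, values)

# The optimizer reads the current scheduled value at every step.
optimizer = tf.train.AdamOptimizer(learning_rate)
```
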
Marshall-yao commented 4 years ago

1) Boundaries and values in your answer: I think the boundary in pfnl.py is max_step and the value is end_lr, right? So the pairs of max_step and end_lr are 1.5e5 / 1e-4, 1.7e5 / 0.5e-4, and 1.9e5 / 0.25e-4, right?

2) Confirmation of training iterations: the iteration count (max_step) in pfnl.py is 1.5e5. I think this setting is used for the ablation experiments (Figure 5).

Besides, I retrained the model with 1.5e5 iterations (without any change). However, the results are PSNR 27.28033 and SSIM 0.834489. The iteration count of the pretrained model is 209999. Thus, if we want to reproduce the results in the paper by retraining this model, the iterations should be set to 2.1e5 instead of 1.5e5, right?

Besides, do you use any tricks?

psychopa4 commented 4 years ago

  1. The learning-rate steps are: 1.5e5 - 1.7e5: 0.5e-4; 1.7e5 - 1.9e5: 0.25e-4; 1.9e5 - 2.1e5: 0.1e-4.
  2. Yes, and it is used for all experiments. First, train the model with the original settings. After 1.5e5 iterations, adjust the iterations and learning rate manually as listed above. No tricks are used. (A sketch of the full schedule follows below.)
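
For reference, the whole 2.1e5-iteration run could also be expressed as a single schedule instead of restarting and adjusting settings by hand. Below is a rough TF1-style sketch under two assumptions: a linear decay for the first phase (pfnl.py may use a different decay curve) and illustrative variable names.

```python
import tensorflow as tf

global_step = tf.train.get_or_create_global_step()

# Phase 1: decay the learning rate from 1e-3 to 1e-4 over the first 1.2e5
# iterations; polynomial_decay holds the end value once decay_steps is
# reached, which also covers the constant 1e-4 stretch up to 1.5e5 iterations.
decayed_lr = tf.train.polynomial_decay(
    learning_rate=1e-3,
    global_step=global_step,
    decay_steps=120000,
    end_learning_rate=1e-4)

# Phase 2: the manual steps listed above (0.5e-4, 0.25e-4, 0.1e-4).
stepped_lr = tf.train.piecewise_constant(
    global_step, [170000, 190000], [0.5e-4, 0.25e-4, 0.1e-4])

# Switch from the decayed value to the stepped values at 1.5e5 iterations.
learning_rate = tf.cond(global_step < 150000,
                        lambda: decayed_lr,
                        lambda: stepped_lr)
```
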
Marshall-yao commented 4 years ago

Thanks so much. I got it.

Besides, did you try training the model with two 2080 Ti GPUs or two 1080 Ti GPUs? If I want to train the model with two 2080 Ti GPUs, how should I set the hyperparameters, apart from doubling the batch size?

psychopa4 commented 4 years ago

Unfortunately, we do not have a distributed version, and we only train our model on one Nvidia GTX 1080 Ti GPU.

Marshall-yao commented 4 years ago

Thanks so much.
I will have a try.

Marshall-yao commented 4 years ago

@psychopa4

I trained the model as you described above. The test results are below.

150k iterations, lr 1e-4: PSNR 27.2458, SSIM 0.831677
170k iterations, lr 0.5e-4: PSNR 27.252, SSIM 0.833426
190k iterations, lr 0.25e-4: PSNR 27.2722, SSIM 0.83377
210k iterations, lr 0.1e-4: PSNR 27.3001, SSIM 0.834906

But the final results are still lower than yours. Have you encountered this problem? How did you deal with it?

psychopa4 commented 4 years ago

I did not meet this problem; maybe you could train it again.

Marshall-yao commented 4 years ago

@psychopa4
Thanks again.
I will try again.