isl-org / PhotorealismEnhancement

Code & Data for Enhancing Photorealism Enhancement

Is discriminator training properly? #27

Closed TiPEX360 closed 2 years ago

TiPEX360 commented 2 years ago

Hi, below is a snippet of the training log from training on a similar dataset I put together. It features a real dataset from Houzz and an artificial dataset of GTAV buildings. Could you explain why I get a loss update at each iteration for the generator but only occasional updates for the discriminator? I can't tell if this is intentional or if I made a mistake along the way.

I've also noticed that the .mat files saved in /out store ['i_fake'] and ['i_real'] correctly, but ['i_rec_fake'] is filled with NaN. Testing the network produces entirely black images.

Sorry, this is probably too wide to format correctly...
2022-04-19 22:53:09,737 346880 rdf0 ds0 rdf1 ds1 rdf2 ds2 rdf3 ds3 rdf4 ds4 rdf5 ds5 rdf6 ds6 rdf7 ds7 rdf8 ds8 rdf9 ds9 rdr0 rdr1 rdr2 rdr3 rdr4 rdr5 rdr6 rdr7 rdr8 rdr9 reg gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 vgg
2022-04-19 22:53:09,737 346880 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:10,071 346881 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.02 0.99 0.99 1.04 0.99 0.99 1.03 0.98 0.97 1.05 0.72
2022-04-19 22:53:11,022 346882 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:11,034 346883 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.01 1.02 1.00 1.06 1.03 1.01 1.08 0.98 1.00 1.04 0.64
2022-04-19 22:53:11,983 346884 0.00 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:12,328 346885 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.04 1.03 0.99 1.04 1.02 1.02 1.07 0.97 0.97 1.03 0.24
2022-04-19 22:53:13,278 346886 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:13,625 346887 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.03 1.02 0.99 1.05 1.03 1.01 1.07 0.97 0.98 1.03 0.50
2022-04-19 22:53:14,576 346888 ---- ---- ---- ---- ---- ---- ---- ---- 0.00 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 ---- ---- ---- ---- ---- 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:14,588 346889 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 1.01 0.99 1.06 1.04 1.02 1.07 0.98 1.00 1.04 0.70
2022-04-19 22:53:15,540 346890 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 0.00 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 ---- ---- ---- ---- 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:15,551 346891 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.01 1.02 1.00 1.06 1.05 0.93 1.09 0.97 1.00 1.04 0.79
2022-04-19 22:53:16,497 346892 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:16,509 346893 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 0.99 1.02 1.00 1.06 1.05 0.92 1.09 0.98 1.01 1.04 0.80
2022-04-19 22:53:17,461 346894 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:17,472 346895 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 1.02 1.00 1.06 1.04 0.93 1.09 0.98 1.01 1.04 0.73
2022-04-19 22:53:18,424 346896 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:18,436 346897 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.02 1.02 1.00 1.05 1.03 0.94 1.08 0.98 1.00 1.03 0.58
2022-04-19 22:53:19,387 346898 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
2022-04-19 22:53:19,399 346899 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 1.02 1.00 1.06 1.05 0.93 1.09 0.98 1.01 1.04 0.73

TiPEX360 commented 2 years ago

So I have since narrowed down the issue: the problem began after I tried to load a previously saved network state using name_load in a 'train' config. If I start from scratch, I don't get the issue. Any ideas why loading a savegame for training might cause problems?

srrichter commented 2 years ago

The discriminator networks are only updated from time to time depending on their performance in classifying real and fake images. If one of them gets too strong, we skip backpropagation for a couple of iterations so the generator can catch up. This is the adaptive backpropagation in the paper. I have no idea about the issue with loading savegames as we have successfully saved and loaded savegames for continuing training. It likely depends on the specific config you are using there.
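To illustrate the skipping behaviour described above, here is a minimal sketch in plain Python. It is not the repository's code: the class name `UpdateGate`, the accuracy threshold, the step size, and the probability floor are all hypothetical choices made for illustration. The idea is only that a discriminator's update probability drops while it classifies too well and recovers once it weakens, which would explain why most discriminator columns in a log show `----` on a given iteration.

```python
import random


class UpdateGate:
    """Hypothetical gate deciding whether one discriminator gets a backward pass.

    Illustration only; thresholds and step sizes are assumptions, not values
    taken from the repository or the paper.
    """

    def __init__(self, target_acc=0.6, step=0.1, min_prob=0.1, seed=0):
        self.target_acc = target_acc  # accuracy above which we call it "too strong"
        self.step = step              # how fast the update probability moves
        self.min_prob = min_prob      # never stop updating entirely
        self.prob = 1.0               # start by always updating
        self._rng = random.Random(seed)

    def report(self, accuracy):
        """Feed back the discriminator's latest real/fake accuracy (0..1)."""
        if accuracy > self.target_acc:
            self.prob = max(self.min_prob, self.prob - self.step)
        else:
            self.prob = min(1.0, self.prob + self.step)

    def should_update(self):
        """Randomly decide whether to backpropagate this iteration."""
        return self._rng.random() < self.prob


def simulate(accuracies, seed=0):
    """Run the gate over a sequence of accuracies; return decisions and final prob."""
    gate = UpdateGate(seed=seed)
    decisions = []
    for acc in accuracies:
        decisions.append(gate.should_update())
        gate.report(acc)  # accuracy is assumed to come from a forward pass only
    return decisions, gate.prob
```

With a discriminator that is always right, `simulate([1.0] * 200)` ends at the probability floor and skips most iterations; with one that is always wrong, every iteration is an update.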

lucamarini22 commented 1 year ago

Hi, could you please explain what the .mat files store? In particular, what do ['i_fake'], ['i_real'], and ['i_rec_fake'] store and represent?

TiPEX360 commented 1 year ago

@lucamarini22 The .mat file essentially stores the model inputs/outputs for that iteration. 'i_fake' is the input image, and 'i_rec_fake' is the generated output image.
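For anyone who wants to check these arrays for the NaN problem mentioned at the top of the thread, a small sketch follows. It assumes the files are in a format that `scipy.io.loadmat` can read and that the keys match the ones named in this thread; the helper names `nan_report` and `nan_report_from_mat` are made up for this example.

```python
import numpy as np


def nan_report(arrays):
    """Return {key: fraction of NaN entries} for the float arrays in a dict."""
    report = {}
    for key, value in arrays.items():
        if key.startswith("__"):  # skip loadmat metadata such as __header__
            continue
        arr = np.asarray(value)
        if np.issubdtype(arr.dtype, np.floating) and arr.size > 0:
            report[key] = float(np.isnan(arr).mean())
    return report


def nan_report_from_mat(path):
    """Load a .mat file and report NaN fractions (requires SciPy)."""
    from scipy.io import loadmat  # imported lazily so nan_report works without SciPy
    return nan_report(loadmat(path))
```

A value of 1.0 for 'i_rec_fake' alongside 0.0 for 'i_fake' and 'i_real' would match the symptom described in the first comment, where the inputs are fine but the generated output is entirely NaN.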