linfengWen98 / CAP-VSTNet

[CVPR 2023] CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
MIT License

Faced unexpected pause when training epoch is 16160 #17

Open LT1st opened 7 months ago

LT1st commented 7 months ago

This is the console output:

Iteration: 00161080/00170000  content_loss:0.0000  lap_loss:0.3854  rec_loss:0.0622  style_loss:1.4862  loss_tmp:0.5256  loss_tmp_GT:0.0664
Iteration: 00161090/00170000  content_loss:0.0000  lap_loss:0.1441  rec_loss:0.1067  style_loss:0.7328  loss_tmp:0.2622  loss_tmp_GT:0.0847
Iteration: 00161100/00170000  content_loss:0.0000  lap_loss:0.0956  rec_loss:0.0610  style_loss:0.3879  loss_tmp:0.4483  loss_tmp_GT:0.0935
Iteration: 00161110/00170000  content_loss:0.0000  lap_loss:0.1170  rec_loss:0.0750  style_loss:0.6948  loss_tmp:0.2367  loss_tmp_GT:0.0769
Iteration: 00161120/00170000  content_loss:0.0000  lap_loss:0.0835  rec_loss:0.0324  style_loss:0.3586  loss_tmp:0.2265  loss_tmp_GT:0.0790
Iteration: 00161130/00170000  content_loss:0.0000  lap_loss:0.1715  rec_loss:0.0607  style_loss:1.0338  loss_tmp:1.1665  loss_tmp_GT:0.0691
Iteration: 00161140/00170000  content_loss:0.0000  lap_loss:0.1329  rec_loss:0.0573  style_loss:0.6555  loss_tmp:0.2451  loss_tmp_GT:0.0630
Iteration: 00161150/00170000  content_loss:0.0000  lap_loss:0.0865  rec_loss:0.0353  style_loss:0.3672  loss_tmp:0.2072  loss_tmp_GT:0.0798
Iteration: 00161160/00170000  content_loss:0.0000  lap_loss:0.1805  rec_loss:0.0556  style_loss:1.0472  loss_tmp:0.4310  loss_tmp_GT:0.0580
Iteration: 00161170/00170000  content_loss:0.0000  lap_loss:0.0714  rec_loss:0.0337  style_loss:0.4977  loss_tmp:0.5335  loss_tmp_GT:0.0828
Iteration: 00161180/00170000  content_loss:0.0000  lap_loss:0.1115  rec_loss:0.0504  style_loss:0.7589  loss_tmp:0.3220  loss_tmp_GT:0.0782
Iteration: 00161190/00170000  content_loss:0.0000  lap_loss:0.0688  rec_loss:0.0449  style_loss:0.3667  loss_tmp:0.3961  loss_tmp_GT:0.0545
Iteration: 00161200/00170000  content_loss:0.0000  lap_loss:0.0567  rec_loss:0.0391  style_loss:0.3564  loss_tmp:0.2393  loss_tmp_GT:0.0682
Iteration: 00161210/00170000  content_loss:0.0000  lap_loss:0.1973  rec_loss:0.3097  style_loss:0.3421  loss_tmp:0.2684  loss_tmp_GT:0.0742
Iteration: 00161220/00170000  content_loss:0.0000  lap_loss:0.1011  rec_loss:0.0443  style_loss:0.4991  loss_tmp:0.7559  loss_tmp_GT:0.0832
Iteration: 00161230/00170000  content_loss:0.0000  lap_loss:0.0907  rec_loss:0.0408  style_loss:0.3279  loss_tmp:0.2799  loss_tmp_GT:0.0609
Iteration: 00161240/00170000  content_loss:0.0000  lap_loss:0.1845  rec_loss:0.1205  style_loss:0.3565  loss_tmp:0.2985  loss_tmp_GT:0.0518
Iteration: 00161250/00170000  content_loss:0.0000  lap_loss:0.2289  rec_loss:0.1843  style_loss:0.3027  loss_tmp:0.2727  loss_tmp_GT:0.0621
Iteration: 00161260/00170000  content_loss:0.0000  lap_loss:0.3555  rec_loss:0.1109  style_loss:1.1843  loss_tmp:0.5432  loss_tmp_GT:0.0804
Iteration: 00161270/00170000  content_loss:0.0000  lap_loss:715.7004  rec_loss:0.9811  style_loss:49.2091  loss_tmp:8.3554  loss_tmp_GT:0.0722
Iteration: 00161280/00170000  content_loss:0.0000  lap_loss:0.3179  rec_loss:0.0679  style_loss:0.5367  loss_tmp:0.3266  loss_tmp_GT:0.0490
Iteration: 00161290/00170000  content_loss:0.0000  lap_loss:0.3358  rec_loss:0.1061  style_loss:0.6838  loss_tmp:0.5130  loss_tmp_GT:0.0722
Iteration: 00161300/00170000  content_loss:0.0000  lap_loss:0.3460  rec_loss:0.0656  style_loss:0.5438  loss_tmp:0.3704  loss_tmp_GT:0.0931
Iteration: 00161310/00170000  content_loss:0.0000  lap_loss:1190.3612  rec_loss:1.0076  style_loss:102.2295  loss_tmp:8.4687  loss_tmp_GT:0.0529
Iteration: 00161320/00170000  content_loss:0.0000  lap_loss:0.2564  rec_loss:0.0999  style_loss:0.4567  loss_tmp:0.3154  loss_tmp_GT:0.0887
Iteration: 00161330/00170000  content_loss:0.0000  lap_loss:0.3323  rec_loss:0.1052  style_loss:1.4866  loss_tmp:0.4579  loss_tmp_GT:0.0910
Iteration: 00161340/00170000  content_loss:0.0000  lap_loss:0.2228  rec_loss:0.0693  style_loss:0.3814  loss_tmp:0.2982  loss_tmp_GT:0.0956
Iteration: 00161350/00170000  content_loss:0.0000  lap_loss:0.3161  rec_loss:0.0936  style_loss:0.7369  loss_tmp:0.5142  loss_tmp_GT:0.0825
Iteration: 00161360/00170000  content_loss:0.0000  lap_loss:0.2863  rec_loss:0.0664  style_loss:0.7711  loss_tmp:0.3755  loss_tmp_GT:0.0543
Iteration: 00161370/00170000  content_loss:0.0000  lap_loss:0.2393  rec_loss:0.0665  style_loss:0.4124  loss_tmp:0.5033  loss_tmp_GT:0.0546
Iteration: 00161380/00170000  content_loss:0.0000  lap_loss:0.4465  rec_loss:0.0993  style_loss:0.8214  loss_tmp:0.3623  loss_tmp_GT:0.0508
Iteration: 00161390/00170000  content_loss:0.0000  lap_loss:0.3830  rec_loss:0.1114  style_loss:0.8339  loss_tmp:0.4083  loss_tmp_GT:0.0753
Iteration: 00161400/00170000  content_loss:0.0000  lap_loss:0.7490  rec_loss:0.0830  style_loss:1.9559  loss_tmp:0.5335  loss_tmp_GT:0.0926
Iteration: 00161410/00170000  content_loss:0.0000  lap_loss:0.4318  rec_loss:0.1619  style_loss:0.3361  loss_tmp:0.4007  loss_tmp_GT:0.0939
Iteration: 00161420/00170000  content_loss:0.0000  lap_loss:0.6868  rec_loss:0.0895  style_loss:0.9060  loss_tmp:1.2179  loss_tmp_GT:0.0785
Iteration: 00161430/00170000  content_loss:0.0000  lap_loss:2.0505  rec_loss:0.1317  style_loss:0.4949  loss_tmp:1.1039  loss_tmp_GT:0.0491
Iteration: 00161440/00170000  content_loss:0.0000  lap_loss:0.9979  rec_loss:0.1391  style_loss:1.0453  loss_tmp:0.6287  loss_tmp_GT:0.0558
Iteration: 00161450/00170000  content_loss:0.0000  lap_loss:1.2907  rec_loss:0.1996  style_loss:0.8235  loss_tmp:0.7697  loss_tmp_GT:0.0757
Iteration: 00161460/00170000  content_loss:0.0000  lap_loss:1.2174  rec_loss:0.2214  style_loss:0.8450  loss_tmp:0.8341  loss_tmp_GT:0.0556
Iteration: 00161470/00170000  content_loss:0.0000  lap_loss:1.5833  rec_loss:0.1535  style_loss:0.8611  loss_tmp:0.7469  loss_tmp_GT:0.0901
Iteration: 00161480/00170000  content_loss:0.0000  lap_loss:1.6554  rec_loss:0.1670  style_loss:0.7574  loss_tmp:0.7843  loss_tmp_GT:0.0714
Iteration: 00161490/00170000  content_loss:0.0000  lap_loss:1.5283  rec_loss:0.1308  style_loss:0.4994  loss_tmp:0.7239  loss_tmp_GT:0.0898
Iteration: 00161500/00170000  content_loss:0.0000  lap_loss:1.4131  rec_loss:0.1164  style_loss:1.0087  loss_tmp:0.6687  loss_tmp_GT:0.0719
Iteration: 00161510/00170000  content_loss:0.0000  lap_loss:1.3814  rec_loss:0.1189  style_loss:0.6020  loss_tmp:0.8305  loss_tmp_GT:0.0644
Iteration: 00161520/00170000  content_loss:0.0000  lap_loss:1.2963  rec_loss:0.1918  style_loss:0.7768  loss_tmp:0.6962  loss_tmp_GT:0.0777
Iteration: 00161530/00170000  content_loss:0.0000  lap_loss:1.3077  rec_loss:0.1180  style_loss:1.2366  loss_tmp:0.6606  loss_tmp_GT:0.0754
Iteration: 00161540/00170000  content_loss:0.0000  lap_loss:1.8840  rec_loss:0.1963  style_loss:0.6856  loss_tmp:0.8398  loss_tmp_GT:0.0790
Iteration: 00161550/00170000  content_loss:0.0000  lap_loss:37.4161  rec_loss:0.4983  style_loss:8.9974  loss_tmp:3.0548  loss_tmp_GT:0.0554
Iteration: 00161560/00170000  content_loss:0.0000  lap_loss:0.9423  rec_loss:0.1765  style_loss:0.5690  loss_tmp:0.9694  loss_tmp_GT:0.0606
Iteration: 00161570/00170000  content_loss:0.0000  lap_loss:0.8936  rec_loss:0.1570  style_loss:0.8383  loss_tmp:0.6511  loss_tmp_GT:0.0804
Iteration: 00161580/00170000  content_loss:0.0000  lap_loss:1.3945  rec_loss:0.4109  style_loss:0.8251  loss_tmp:0.9200  loss_tmp_GT:0.0933
Iteration: 00161590/00170000  content_loss:0.0000  lap_loss:1.4182  rec_loss:0.1355  style_loss:0.8771  loss_tmp:0.7025  loss_tmp_GT:0.0806
Iteration: 00161600/00170000  content_loss:0.0000  lap_loss:2.0692  rec_loss:0.2017  style_loss:0.4177  loss_tmp:0.8337  loss_tmp_GT:0.0946
Iteration: 00161610/00170000  content_loss:0.0000  lap_loss:397.9501  rec_loss:1.3158  style_loss:46.0847  loss_tmp:9.6579  loss_tmp_GT:0.0553
Iteration: 00161620/00170000  content_loss:0.0000  lap_loss:4.0763  rec_loss:0.3362  style_loss:0.6721  loss_tmp:1.3075  loss_tmp_GT:0.0777
Iteration: 00161630/00170000  content_loss:0.0000  lap_loss:10.1120  rec_loss:0.5004  style_loss:1.2209  loss_tmp:1.7422  loss_tmp_GT:0.0882
Iteration: 00161640/00170000  content_loss:0.0000  lap_loss:5.8842  rec_loss:0.3475  style_loss:0.8090  loss_tmp:1.5759  loss_tmp_GT:0.0754
Iteration: 00161650/00170000  content_loss:0.0000  lap_loss:7.2984  rec_loss:0.3977  style_loss:1.8986  loss_tmp:1.9794  loss_tmp_GT:0.0659
Iteration: 00161660/00170000  content_loss:0.0000  lap_loss:16.9144  rec_loss:0.5072  style_loss:1.3880  loss_tmp:3.1661  loss_tmp_GT:0.0762
Iteration: 00161670/00170000  content_loss:0.0000  lap_loss:8.6051  rec_loss:0.4152  style_loss:0.9209  loss_tmp:1.7897  loss_tmp_GT:0.0745
Iteration: 00161680/00170000  content_loss:0.0000  lap_loss:18.7265  rec_loss:0.7623  style_loss:1.7309  loss_tmp:2.4511  loss_tmp_GT:0.0596
Iteration: 00161690/00170000  content_loss:0.0000  lap_loss:26.2579  rec_loss:0.9497  style_loss:3.5073  loss_tmp:3.2847  loss_tmp_GT:0.0746
Iteration: 00161700/00170000  content_loss:0.0000  lap_loss:40.5071  rec_loss:1.2338  style_loss:4.2289  loss_tmp:4.3614  loss_tmp_GT:0.0877
Iteration: 00161710/00170000  content_loss:0.0000  lap_loss:75.8527  rec_loss:1.7527  style_loss:7.1905  loss_tmp:6.1432  loss_tmp_GT:0.0622
Iteration: 00161720/00170000  content_loss:0.0000  lap_loss:132.4727  rec_loss:3.4893  style_loss:10.6970  loss_tmp:7.7992  loss_tmp_GT:0.0869
Iteration: 00161730/00170000  content_loss:0.0000  lap_loss:164.3445  rec_loss:2.3470  style_loss:10.5640  loss_tmp:9.1285  loss_tmp_GT:0.0617
Iteration: 00161740/00170000  content_loss:0.0000  lap_loss:163.4563  rec_loss:1.8969  style_loss:9.1780  loss_tmp:10.1247  loss_tmp_GT:0.0708
Iteration: 00161750/00170000  content_loss:0.0000  lap_loss:418.0580  rec_loss:6.6835  style_loss:18.3620  loss_tmp:14.3527  loss_tmp_GT:0.0823
Iteration: 00161760/00170000  content_loss:0.0000  lap_loss:599.1832  rec_loss:9.9018  style_loss:54.3852  loss_tmp:16.5558  loss_tmp_GT:0.0779
Iteration: 00161770/00170000  content_loss:0.0000  lap_loss:1377.3221  rec_loss:11.8466  style_loss:88.2415  loss_tmp:20.6850  loss_tmp_GT:0.0696
Iteration: 00161780/00170000  content_loss:0.0000  lap_loss:1219.5043  rec_loss:13.1540  style_loss:73.4607  loss_tmp:25.3794  loss_tmp_GT:0.0756
Iteration: 00161790/00170000  content_loss:0.0000  lap_loss:5094.0000  rec_loss:9.5246  style_loss:221.0599  loss_tmp:33.7047  loss_tmp_GT:0.0947
Iteration: 00161800/00170000  content_loss:0.0000  lap_loss:1941.8975  rec_loss:31.7524  style_loss:250.0916  loss_tmp:68.2838  loss_tmp_GT:0.0637
Iteration: 00161810/00170000  content_loss:0.0000  lap_loss:3418.7014  rec_loss:20.3584  style_loss:232.5759  loss_tmp:42.6371  loss_tmp_GT:0.0911
Iteration: 00161820/00170000  content_loss:0.0000  lap_loss:418235104.0000  rec_loss:214.8812  style_loss:9325154.0000  loss_tmp:4957.1558  loss_tmp_GT:0.0785
Iteration: 00161830/00170000  content_loss:0.0000  lap_loss:9133.0684  rec_loss:72.6381  style_loss:698.3317  loss_tmp:58.0809  loss_tmp_GT:0.0579
Iteration: 00161840/00170000  content_loss:0.0000  lap_loss:9114.7314  rec_loss:48.7064  style_loss:624.2943  loss_tmp:56.9594  loss_tmp_GT:0.0858
Iteration: 00161850/00170000  content_loss:0.0000  lap_loss:16554.5078  rec_loss:104.8364  style_loss:1542.5042  loss_tmp:88.9630  loss_tmp_GT:0.0712
Iteration: 00161860/00170000  content_loss:0.0000  lap_loss:10247.7246  rec_loss:65.9900  style_loss:1027.9641  loss_tmp:96.4846  loss_tmp_GT:0.0727
Iteration: 00161870/00170000  content_loss:0.0000  lap_loss:19196.0527  rec_loss:77.2881  style_loss:1428.4135  loss_tmp:125.3436  loss_tmp_GT:0.0677
Iteration: 00161880/00170000  content_loss:0.0000  lap_loss:216289.6719  rec_loss:98.5644  style_loss:14655.6758  loss_tmp:218.4098  loss_tmp_GT:0.0702
Iteration: 00161890/00170000  content_loss:0.0000  lap_loss:19604.2520  rec_loss:50.9366  style_loss:1325.8600  loss_tmp:95.0826  loss_tmp_GT:0.0942
Iteration: 00161900/00170000  content_loss:0.0000  lap_loss:93659.1016  rec_loss:297.3892  style_loss:5191.9561  loss_tmp:227.0423  loss_tmp_GT:0.0611
Iteration: 00161910/00170000  content_loss:0.0000  lap_loss:86273.3594  rec_loss:174.2626  style_loss:4537.2666  loss_tmp:169.1176  loss_tmp_GT:0.0884
Iteration: 00161920/00170000  content_loss:0.0000  lap_loss:100730.4844  rec_loss:231.1616  style_loss:8772.8340  loss_tmp:295.8207  loss_tmp_GT:0.0779
Iteration: 00161930/00170000  content_loss:0.0000  lap_loss:389786.8125  rec_loss:618.0142  style_loss:26791.2461  loss_tmp:366.0463  loss_tmp_GT:0.0742
Iteration: 00161940/00170000  content_loss:0.0000  lap_loss:15906467840.0000  rec_loss:14955.5361  style_loss:303860064.0000  loss_tmp:72561.6250  loss_tmp_GT:0.0740

nvidia-smi gives the following output:

Tue Feb 13 11:46:41 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:5E:00.0  On |                  N/A |
| 51%   38C    P2   103W / 350W |  13277MiB / 24576MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:AF:00.0 Off |                  N/A |
|  0%   39C    P8    15W / 350W |      5MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     16179      G   /usr/lib/xorg/Xorg                153MiB |
|    0   N/A  N/A    610515      C   python                          13120MiB |
|    1   N/A  N/A     16179      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

Should I give up training?
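For anyone trying to pin down where a run like this starts to diverge, here is a minimal sketch that scans log lines in the format shown above and flags iterations where one loss jumps far above its recent median. The function names, the window of 20 lines, and the 50x threshold are illustrative assumptions, not part of the CAP-VSTNet code.

```python
import math
import re
from statistics import median

# Matches the "Iteration: 00161080/00170000" prefix of a training log line.
ITER_RE = re.compile(r"Iteration:\s*(\d+)/(\d+)")
# Matches "name:value" pairs such as "lap_loss:0.3854".
FIELD_RE = re.compile(r"(\w+):\s*(-?(?:inf|nan|\d+(?:\.\d*)?(?:[eE][-+]?\d+)?))")


def parse_line(line):
    """Return (iteration, {loss_name: value}), or None for non-log lines."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    fields = {k: float(v) for k, v in FIELD_RE.findall(line[m.end():])}
    return int(m.group(1)), fields


def find_spikes(lines, key="lap_loss", window=20, factor=50.0):
    """Yield (iteration, value) where `key` is non-finite or exceeds
    `factor` times the median of the previous `window` finite values."""
    history = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None or key not in parsed[1]:
            continue
        iteration, fields = parsed
        value = fields[key]
        if not math.isfinite(value):
            yield iteration, value
            continue
        recent = history[-window:]
        if len(recent) >= 3 and value > factor * median(recent):
            yield iteration, value
        history.append(value)
```

Calling `list(find_spikes(open("train.log")))` on a saved log (the file name is hypothetical) lists the iterations worth inspecting, which can help decide which checkpoint to resume from.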

LT1st commented 7 months ago

I left everything unchanged and trained it on my own dataset.

linfengWen98 commented 6 months ago

This may be caused by a gradient problem in the Cholesky decomposition. The code has been updated.
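The thread does not say what the update changed. As general background, the backward pass of a Cholesky factorization involves solves with the factor, so gradients can blow up when the covariance is close to singular. One common mitigation, sketched here in NumPy as an assumption and not as the repository's actual fix, is to add a small multiple of the identity to the covariance and retry when the factorization fails. The function name and the default `eps` are hypothetical.

```python
import numpy as np


def cholesky_with_jitter(cov, eps=1e-6, max_tries=5):
    """Return (L, jitter) with L @ L.T == cov + jitter * I.

    Tries the plain factorization first; on failure adds eps * I and
    grows the jitter tenfold on each further retry.
    """
    cov = np.asarray(cov, dtype=np.float64)
    cov = 0.5 * (cov + cov.T)  # remove tiny asymmetries from rounding
    eye = np.eye(cov.shape[0])
    jitter = 0.0
    for _ in range(max_tries + 1):
        try:
            return np.linalg.cholesky(cov + jitter * eye), jitter
        except np.linalg.LinAlgError:
            jitter = eps if jitter == 0.0 else jitter * 10.0
    raise np.linalg.LinAlgError("not positive definite even with jitter")
```

The jitter trades a small bias in the covariance for a better-conditioned factor; if large values are needed to make the factorization succeed, the features themselves are probably degenerate.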

musetee commented 5 months ago

I applied your network to the synthRad dataset, converting the gray-scale one-channel CT and MRI images to three channels by simply concatenating. With the original training settings, I always got the warning "Cholesky Decomposition fails. Gradient infinity. Skip current batch." Could you give me any advice? Thanks a lot for your excellent work :)
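As a small illustration of the preprocessing described here, the sketch below stacks one gray-scale image into three channels and measures the rank of the covariance between channels. It assumes the three channels are identical copies, which the comment does not state explicitly. Whether this degeneracy carries through to the features that the Cholesky step sees is not established in this thread, so treat it as something to check and not as a diagnosis. Function names are hypothetical.

```python
import numpy as np


def gray_to_three_channels(img):
    """Stack a (H, W) gray-scale image into (3, H, W) identical channels."""
    return np.repeat(img[np.newaxis, ...], 3, axis=0)


def channel_covariance_rank(img3):
    """Numerical rank of the covariance between flattened channels."""
    flat = img3.reshape(img3.shape[0], -1).astype(np.float64)
    return int(np.linalg.matrix_rank(np.cov(flat)))
```

For identical channels the 3x3 channel covariance has rank 1, so it is singular at the input level, whereas three independent channels give full rank.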

LT1st commented 5 months ago

> This may be caused by a gradient problem in the Cholesky decomposition. The code has been updated.

Thank you very much for your suggestion.

Have a nice day.

linfengWen98 commented 5 months ago

> I applied your network to the synthRad dataset, converting the gray-scale one-channel CT and MRI images to three channels by simply concatenating. With the original training settings, I always got the warning "Cholesky Decomposition fails. Gradient infinity. Skip current batch." Could you give me any advice? Thanks a lot for your excellent work :)

Sorry, the parameter 'use_double' in cWCT.py should be set to True. The code has been updated.
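To give a feel for why double precision can matter here, the following rule-of-thumb sketch compares a covariance matrix's condition number with the reciprocal of the machine epsilon for a given dtype. The helper name and the threshold are illustrative assumptions; how `use_double` is implemented in cWCT.py is not shown in this thread.

```python
import numpy as np


def condition_ok(cov, dtype):
    """Heuristic: True if cond(cov) is below 1 / eps for `dtype`.

    Above that bound, rounding error in `dtype` is comparable to the
    smallest eigenvalue, so a Cholesky factorization may fail.
    """
    cond = np.linalg.cond(np.asarray(cov, dtype=np.float64))
    return bool(cond < 1.0 / np.finfo(dtype).eps)
```

For example, `np.diag([1.0, 1e-10])` has a condition number of about 1e10, which is beyond what float32 (eps around 1.2e-7) can resolve but well within float64 (eps around 2.2e-16).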

LT1st commented 4 months ago

I am using the updated code, but the error still occurs as before:

Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019320/00170000  content_loss:0.0000  lap_loss:159.2791  rec_loss:1.5052  style_loss:2.7486  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                                
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019330/00170000  content_loss:0.0000  lap_loss:157.6614  rec_loss:1.9370  style_loss:3.4438  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                                
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019340/00170000  content_loss:0.0000  lap_loss:275.8496  rec_loss:2.7183  style_loss:4.6778  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                                
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019350/00170000  content_loss:0.0000  lap_loss:354.6751  rec_loss:3.2100  style_loss:6.2185  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                                
Iteration: 00019360/00170000  content_loss:0.0000  lap_loss:559.2527  rec_loss:3.3058  style_loss:16.2796  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                               
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019370/00170000  content_loss:0.0000  lap_loss:662.8289  rec_loss:2.8588  style_loss:15.7534  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                               
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019380/00170000  content_loss:0.0000  lap_loss:1228.8931  rec_loss:6.4255  style_loss:24.9516  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                              
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019390/00170000  content_loss:0.0000  lap_loss:1417.4567  rec_loss:5.4163  style_loss:35.2438  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                              
Iteration: 00019400/00170000  content_loss:0.0000  lap_loss:1618.0776  rec_loss:8.5358  style_loss:32.9446  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                              
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019410/00170000  content_loss:0.0000  lap_loss:2457.8999  rec_loss:12.6839  style_loss:135.8649  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                            
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019420/00170000  content_loss:0.0000  lap_loss:2929.5208  rec_loss:11.8296  style_loss:45.2074  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                             
Iteration: 00019430/00170000  content_loss:0.0000  lap_loss:5270.0308  rec_loss:17.2902  style_loss:98.4670  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                             
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019440/00170000  content_loss:0.0000  lap_loss:12780.7393  rec_loss:53.6309  style_loss:517.0043  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                           
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019450/00170000  content_loss:0.0000  lap_loss:33461.5977  rec_loss:85.3787  style_loss:1157.2053  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                          
Iteration: 00019460/00170000  content_loss:0.0000  lap_loss:47801.2383  rec_loss:110.6021  style_loss:927.8384  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                          
Iteration: 00019470/00170000  content_loss:0.0000  lap_loss:105222.0625  rec_loss:200.1128  style_loss:3148.9697  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                        
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00019480/00170000  content_loss:0.0000  lap_loss:245005.7812  rec_loss:699.6542  style_loss:7568.2607  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                        
Iteration: 00019490/00170000  content_loss:0.0000  lap_loss:301240.6875  rec_loss:538.1981  style_loss:10290.3311  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                       
Iteration: 00019500/00170000  content_loss:0.0000  lap_loss:452129.2812  rec_loss:402.8548  style_loss:13304.7422  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                       
Iteration: 00019510/00170000  content_loss:0.0000  lap_loss:781383.0625  rec_loss:497.7421  style_loss:20818.8828  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                       
Iteration: 00019520/00170000  content_loss:0.0000  lap_loss:1462175.7500  rec_loss:561.4804  style_loss:24778.1133  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                      
Iteration: 00019530/00170000  content_loss:0.0000  lap_loss:2074892.6250  rec_loss:915.9771  style_loss:46083.4766  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                      
Iteration: 00019540/00170000  content_loss:0.0000  lap_loss:2088487.0000  rec_loss:2681.4905  style_loss:54167.8203  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                     
Iteration: 00019550/00170000  content_loss:0.0000  lap_loss:3387649.0000  rec_loss:1232.0409  style_loss:76106.6094  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                     
Iteration: 00019560/00170000  content_loss:0.0000  lap_loss:10170006.0000  rec_loss:3335.5005  style_loss:321179.4375  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                   
Iteration: 00019570/00170000  content_loss:0.0000  lap_loss:11045876.0000  rec_loss:1911.4858  style_loss:231068.3438  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                   
Iteration: 00019580/00170000  content_loss:0.0000  lap_loss:13109793.0000  rec_loss:4361.4756  style_loss:309041.6875  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                   
Iteration: 00019590/00170000  content_loss:0.0000  lap_loss:23149722.0000  rec_loss:6688.6255  style_loss:791635.1250  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                   
Iteration: 00019600/00170000  content_loss:0.0000  lap_loss:13212357.0000  rec_loss:4038.3118  style_loss:313771.0938  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                   
Iteration: 00019610/00170000  content_loss:0.0000  lap_loss:63433156.0000  rec_loss:5623.5425  style_loss:2108203.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                  
Iteration: 00019620/00170000  content_loss:0.0000  lap_loss:54538236.0000  rec_loss:7540.0439  style_loss:1154739.5000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                  
Iteration: 00019630/00170000  content_loss:0.0000  lap_loss:133052504.0000  rec_loss:7938.0957  style_loss:2468518.7500  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                 
Iteration: 00019640/00170000  content_loss:0.0000  lap_loss:219064544.0000  rec_loss:14273.8584  style_loss:4771924.5000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                
Iteration: 00019650/00170000  content_loss:0.0000  lap_loss:1157315584.0000  rec_loss:49690.8906  style_loss:33978448.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                              
Iteration: 00019660/00170000  content_loss:0.0000  lap_loss:652230848.0000  rec_loss:33877.5156  style_loss:22673144.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                               
Iteration: 00019670/00170000  content_loss:0.0000  lap_loss:260332928.0000  rec_loss:14764.3955  style_loss:6373633.5000  loss_tmp:0.0000  loss_tmp_GT:0.0000 
Iteration: 00019700/00170000  content_loss:0.0000  lap_loss:175779216.0000  rec_loss:12537.4365  style_loss:4020703.2500  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                
Iteration: 00019710/00170000  content_loss:0.0000  lap_loss:244010336.0000  rec_loss:14608.0186  style_loss:4740527.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                
Iteration: 00019720/00170000  content_loss:0.0000  lap_loss:507039232.0000  rec_loss:38317.7891  style_loss:9450994.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                
Iteration: 00019730/00170000  content_loss:0.0000  lap_loss:186500992.0000  rec_loss:9350.5967  style_loss:3021213.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                 
Iteration: 00019740/00170000  content_loss:0.0000  lap_loss:175715888.0000  rec_loss:7774.5225  style_loss:3432000.2500  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                 
Iteration: 00019750/00170000  content_loss:0.0000  lap_loss:525769280.0000  rec_loss:17648.4688  style_loss:9259133.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                
Iteration: 00019760/00170000  content_loss:0.0000  lap_loss:1385499648.0000  rec_loss:17735.6582  style_loss:21579124.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                              
Iteration: 00019770/00170000  content_loss:0.0000  lap_loss:2157881856.0000  rec_loss:31950.2969  style_loss:37357400.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                              
Iteration: 00019780/00170000  content_loss:0.0000  lap_loss:3972313088.0000  rec_loss:104818.9844  style_loss:87564992.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                             
Iteration: 00019790/00170000  content_loss:0.0000  lap_loss:8301088256.0000  rec_loss:57633.3555  style_loss:158462496.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                             
Iteration: 00019800/00170000  content_loss:0.0000  lap_loss:27188193280.0000  rec_loss:136121.8281  style_loss:430402016.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                           
Iteration: 00019810/00170000  content_loss:0.0000  lap_loss:109260144640.0000  rec_loss:434222.3750  style_loss:1828851456.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                         
Iteration: 00019820/00170000  content_loss:0.0000  lap_loss:314164772864.0000  rec_loss:1450301.0000  style_loss:7710819328.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                        
Iteration: 00019830/00170000  content_loss:0.0000  lap_loss:601583976448.0000  rec_loss:649785.0625  style_loss:10285472768.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                        
Iteration: 00019840/00170000  content_loss:0.0000  lap_loss:445602758656.0000  rec_loss:323235.3750  style_loss:7091161088.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                         
Iteration: 00019850/00170000  content_loss:0.0000  lap_loss:663548264448.0000  rec_loss:465235.5625  style_loss:11168151552.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                        
Iteration: 00019860/00170000  content_loss:0.0000  lap_loss:2332661645312.0000  rec_loss:2389557.0000  style_loss:48662765568.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019870/00170000  content_loss:0.0000  lap_loss:1990058049536.0000  rec_loss:1574681.0000  style_loss:36620349440.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019880/00170000  content_loss:0.0000  lap_loss:2676915961856.0000  rec_loss:494977.8438  style_loss:38082404352.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                       
Iteration: 00019890/00170000  content_loss:0.0000  lap_loss:3788393152512.0000  rec_loss:1284546.6250  style_loss:81677107200.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019900/00170000  content_loss:0.0000  lap_loss:3168595869696.0000  rec_loss:550553.0000  style_loss:47469285376.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                       
Iteration: 00019910/00170000  content_loss:0.0000  lap_loss:3099389591552.0000  rec_loss:2370434.0000  style_loss:72614346752.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019920/00170000  content_loss:0.0000  lap_loss:3503058845696.0000  rec_loss:2100743.2500  style_loss:65855000576.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019930/00170000  content_loss:0.0000  lap_loss:2580092813312.0000  rec_loss:1831351.8750  style_loss:43814285312.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019940/00170000  content_loss:0.0000  lap_loss:1809156145152.0000  rec_loss:566731.1250  style_loss:26138923008.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                       
Iteration: 00019950/00170000  content_loss:0.0000  lap_loss:3197836722176.0000  rec_loss:2696372.7500  style_loss:57295392768.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                      
Iteration: 00019960/00170000  content_loss:0.0000  lap_loss:11973642158080.0000  rec_loss:1680209.7500  style_loss:193298317312.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00019970/00170000  content_loss:0.0000  lap_loss:28124095971328.0000  rec_loss:1428778.6250  style_loss:484218535936.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00019980/00170000  content_loss:0.0000  lap_loss:27247220097024.0000  rec_loss:2255443.0000  style_loss:386921431040.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00019990/00170000  content_loss:0.0000  lap_loss:29163742298112.0000  rec_loss:2618583.2500  style_loss:418054569984.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00020000/00170000  content_loss:0.0000  lap_loss:31372972392448.0000  rec_loss:2565462.7500  style_loss:496382509056.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00020010/00170000  content_loss:0.0000  lap_loss:39647988154368.0000  rec_loss:5251524.0000  style_loss:583873921024.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00020020/00170000  content_loss:0.0000  lap_loss:46498096087040.0000  rec_loss:2221302.7500  style_loss:604987326464.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                    
Iteration: 00020030/00170000  content_loss:0.0000  lap_loss:93096444428288.0000  rec_loss:5183118.0000  style_loss:1156396875776.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                   
Iteration: 00020040/00170000  content_loss:0.0000  lap_loss:573172378238976.0000  rec_loss:29055988.0000  style_loss:10539991302144.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                
Iteration: 00020050/00170000  content_loss:0.0000  lap_loss:543102305566720.0000  rec_loss:8868641.0000  style_loss:7797688238080.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                  
Iteration: 00020060/00170000  content_loss:0.0000  lap_loss:477242605961216.0000  rec_loss:7353300.0000  style_loss:6330485047296.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                  
Iteration: 00020070/00170000  content_loss:0.0000  lap_loss:719586605400064.0000  rec_loss:9596196.0000  style_loss:10513586061312.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                 
Iteration: 00020080/00170000  content_loss:0.0000  lap_loss:2917163799150592.0000  rec_loss:16187840.0000  style_loss:39120764141568.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                               
Cholesky Decomposition fails. Gradient infinity. Skip current batch.                                                                                                                                                         
Iteration: 00020090/00170000  content_loss:0.0000  lap_loss:6823342065582080.0000  rec_loss:26745362.0000  style_loss:92027928707072.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                               
Iteration: 00020100/00170000  content_loss:0.0000  lap_loss:14940274593628160.0000  rec_loss:117756656.0000  style_loss:230195063685120.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                            
Iteration: 00020110/00170000  content_loss:0.0000  lap_loss:45787877543510016.0000  rec_loss:377901056.0000  style_loss:712197583929344.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                            
Iteration: 00020120/00170000  content_loss:0.0000  lap_loss:47623589515493376.0000  rec_loss:91975888.0000  style_loss:718440654438400.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                             
Iteration: 00020130/00170000  content_loss:0.0000  lap_loss:65257887714246656.0000  rec_loss:67180312.0000  style_loss:992186569064448.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                             
Iteration: 00020140/00170000  content_loss:0.0000  lap_loss:41286017377894400.0000  rec_loss:327302144.0000  style_loss:644938261856256.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                            
Iteration: 00020150/00170000  content_loss:0.0000  lap_loss:37427715111911424.0000  rec_loss:110797008.0000  style_loss:580690785599488.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                            
Iteration: 00020160/00170000  content_loss:0.0000  lap_loss:123380915626835968.0000  rec_loss:105322232.0000  style_loss:1489347824058368.0000  loss_tmp:0.0000  loss_tmp_GT:0.0000                                          
Iteration: 00020170/00170000  content_loss:0.0000  lap_loss:nan  rec_loss:nan  style_loss:nan  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                                           
Iteration: 00020180/00170000  content_loss:0.0000  lap_loss:nan  rec_loss:nan  style_loss:nan  loss_tmp:0.0000  loss_tmp_GT:0.0000                                                                                           
Iteration: 00020190/00170000  content_loss:0.0000  lap_loss:nan  rec_loss:nan  style_loss:nan  loss_tmp:0.0000  loss_tmp_GT:0.0000   
linfengWen98 commented 4 months ago

I trained the model on the COCO dataset and training was stable. If your code is the same, please check your dataset.

Iteration: 00169900/00170000  content_loss:0.0000  lap_loss:0.0377  rec_loss:0.0686  style_loss:0.8494  loss_tmp:0.4747  loss_tmp_GT:0.0703
Iteration: 00169910/00170000  content_loss:0.0000  lap_loss:0.0928  rec_loss:0.1001  style_loss:2.1163  loss_tmp:0.3042  loss_tmp_GT:0.0815
Iteration: 00169920/00170000  content_loss:0.0000  lap_loss:0.1460  rec_loss:0.0882  style_loss:1.8143  loss_tmp:0.4197  loss_tmp_GT:0.0778
Iteration: 00169930/00170000  content_loss:0.0000  lap_loss:0.0464  rec_loss:0.0757  style_loss:1.1384  loss_tmp:0.3250  loss_tmp_GT:0.0662
Iteration: 00169940/00170000  content_loss:0.0000  lap_loss:0.1805  rec_loss:0.1085  style_loss:2.7221  loss_tmp:0.4230  loss_tmp_GT:0.0891
Iteration: 00169950/00170000  content_loss:0.0000  lap_loss:0.0585  rec_loss:0.1119  style_loss:1.3657  loss_tmp:0.2979  loss_tmp_GT:0.0695
Iteration: 00169960/00170000  content_loss:0.0000  lap_loss:0.1552  rec_loss:0.0925  style_loss:1.8784  loss_tmp:0.3627  loss_tmp_GT:0.0542
Iteration: 00169970/00170000  content_loss:0.0000  lap_loss:0.0782  rec_loss:0.1287  style_loss:1.5004  loss_tmp:0.5541  loss_tmp_GT:0.0942
Iteration: 00169980/00170000  content_loss:0.0000  lap_loss:0.0465  rec_loss:0.0897  style_loss:1.0761  loss_tmp:0.3871  loss_tmp_GT:0.0488
Iteration: 00169990/00170000  content_loss:0.0000  lap_loss:0.0396  rec_loss:0.1487  style_loss:1.0013  loss_tmp:0.2502  loss_tmp_GT:0.0757
Iteration: 00170000/00170000  content_loss:0.0000  lap_loss:0.2025  rec_loss:0.0613  style_loss:1.9088  loss_tmp:0.3521  loss_tmp_GT:0.0695
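
To compare a failing run against a stable one like the log above, it can help to find the first iteration at which any logged loss leaves a sane range. The following is a minimal sketch, not part of the CAP-VSTNet code: it assumes the console output has been saved to a text file in the `Iteration: N/M  name:value ...` format shown in this thread. The function names and the `threshold` default of `1e3` are hypothetical choices for illustration.

```python
import math
import re

# Matches "name:value" pairs such as "lap_loss:0.3854" or "rec_loss:nan".
PAIR = re.compile(r"(\w+):(\S+)")
# Matches the "Iteration: 00019790/00170000" prefix of a log line.
ITER = re.compile(r"Iteration:\s*(\d+)/(\d+)")


def parse_line(line):
    """Return (iteration, {loss_name: value}), or None for non-iteration lines."""
    m = ITER.search(line)
    if m is None:
        return None
    losses = {}
    # Only look at the text after the iteration counter.
    for name, value in PAIR.findall(line[m.end():]):
        try:
            losses[name] = float(value)  # float("nan") parses to NaN
        except ValueError:
            continue  # skip anything that is not a number
    return int(m.group(1)), losses


def first_divergence(lines, threshold=1e3):
    """Return (iteration, loss_name, value) for the first loss that is
    non-finite or larger in magnitude than `threshold` (an assumed cutoff),
    or None if every logged loss stays in range."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # e.g. the "Cholesky Decomposition fails" message
        iteration, losses = parsed
        for name, value in losses.items():
            if not math.isfinite(value) or abs(value) > threshold:
                return iteration, name, value
    return None
```

Passing an open file object (for example a hypothetical `train.log`) to `first_divergence` returns the earliest offending iteration. That iteration can then be matched against the batches loaded around that point when checking the dataset.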