gordicaleksa / pytorch-neural-style-transfer

Reconstruction of the original paper on neural style transfer (Gatys et al.). I've additionally included reconstruction scripts which allow you to reconstruct only the content or the style of the image - for better understanding of how NST works.
https://youtube.com/c/TheAIEpiphany
MIT License

Changing height value doesn't produce any results. #2

Open gateway opened 3 years ago

gateway commented 3 years ago

Editing neural_style_transfer.py to change the default value of --height, or passing --height on the command line, produces no final result. The data/output-images folder that gets created is empty.

I edited the default value and changed it to 1280 for a photo I wanted to use. I have a Titan RTX.

(pytorch-nst) gateway@gateway-media:~/work/ns/pytorch-neural-style-transfer$ python neural_style_transfer.py --content_img_name i1.jpg --style_img_name tre.jpg
Using vgg19 in the optimization procedure.
L-BFGS | iteration: 000, total loss=2920575401984.0000, content_loss=      0.0000, style loss=2920527360000.0000, tv loss=48057560.0000
L-BFGS | iteration: 001, total loss=2920575401984.0000, content_loss=      0.0001, style loss=2920527360000.0000, tv loss=48057560.0000
L-BFGS | iteration: 002, total loss=2920575401984.0000, content_loss=      0.0001, style loss=2920527360000.0000, tv loss=48057560.0000
L-BFGS | iteration: 003, total loss=2920575401984.0000, content_loss=      0.0001, style loss=2920527360000.0000, tv loss=48057560.0000
L-BFGS | iteration: 004, total loss=2920575401984.0000, content_loss=      0.0001, style loss=2920527360000.0000, tv loss=48057560.0000

Using the command-line switch:

(pytorch-nst) gateway@gateway-media:~/work/ns/pytorch-neural-style-transfer$ python neural_style_transfer.py --content_img_name i1.jpg --style_img_name tre.jpg --height 1280
Using vgg19 in the optimization procedure.
L-BFGS | iteration: 000, total loss=10584828411904.0000, content_loss=      0.0000, style loss=10584816000000.0000, tv loss=12864146.0000
L-BFGS | iteration: 001, total loss=10584828411904.0000, content_loss=      0.0002, style loss=10584816000000.0000, tv loss=12864146.0000
L-BFGS | iteration: 002, total loss=10584828411904.0000, content_loss=      0.0002, style loss=10584816000000.0000, tv loss=12864146.0000
L-BFGS | iteration: 003, total loss=10584828411904.0000, content_loss=      0.0002, style loss=10584816000000.0000, tv loss=12864146.0000
L-BFGS | iteration: 004, total loss=10584828411904.0000, content_loss=      0.0002, style loss=10584816000000.0000, tv loss=12864146.0000
(pytorch-nst) gateway@gateway-media:~/work/ns/pytorch-neural-style-transfer$ 

nvidia-smi

Tue Oct 13 16:35:56 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN RTX           Off  | 00000000:01:00.0 Off |                  N/A |
| 41%   29C    P8    15W / 280W |    292MiB / 24220MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 21%   33C    P8     5W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2061      G   /usr/lib/xorg/Xorg                191MiB |
|    0   N/A  N/A      2745      G   ...mviewer/tv_bin/TeamViewer       13MiB |
|    0   N/A  N/A      2949      G   /usr/bin/gnome-shell               83MiB |
+-----------------------------------------------------------------------------+

leoliuf commented 2 years ago

Same issue here too

leoliuf commented 2 years ago

I found that if you use the Adam optimizer, you can use a larger size without problems. If you keep using L-BFGS, increasing the learning rate can also help avoid this glitch.
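
A minimal sketch of the two options above, assuming a PyTorch setup in which the image itself is the tensor being optimized. The helper name `make_optimizer`, the learning-rate values, and the toy MSE loss are assumptions for illustration only; they are not taken from neural_style_transfer.py, and the real script would use its content/style/tv loss in the closure.

```python
import torch


def make_optimizer(image, name="lbfgs", lr=None):
    """Hypothetical helper: build an optimizer over the image being optimized."""
    if name == "adam":
        # Assumed default; tune for your own images.
        return torch.optim.Adam([image], lr=0.1 if lr is None else lr)
    if name == "lbfgs":
        # torch.optim.LBFGS defaults to lr=1; pass a larger value to try
        # the "increase the learning rate" suggestion.
        return torch.optim.LBFGS([image], lr=1.0 if lr is None else lr, max_iter=20)
    raise ValueError(f"unknown optimizer: {name}")


# Toy stand-in for the style transfer objective (assumption, not the real loss).
torch.manual_seed(0)
target = torch.rand(1, 3, 8, 8)
image = torch.zeros_like(target).requires_grad_(True)

optimizer = make_optimizer(image, "adam", lr=0.1)


def closure():
    # Recompute the loss and gradients; both Adam and LBFGS accept a closure.
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(image, target)
    loss.backward()
    return loss


initial = closure().item()
for _ in range(50):
    optimizer.step(closure)
final = closure().item()
print(f"loss before: {initial:.4f}, after: {final:.4f}")
```

Switching to `make_optimizer(image, "lbfgs", lr=2.0)` keeps the same closure, so the two optimizers can be compared on the same image without touching the loss code.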

Finerrkekz commented 2 years ago

I'm also having this issue. I can go up to 2500 pixels with the Caffe model, but this one seems to break at the 4th iteration and I have no idea how to fix it. I'm disappointed because the results are otherwise fantastic; it's just unstable at higher resolutions. I've been trying to achieve results similar to this repo's in others, but I haven't been successful yet.

Finerrkekz commented 2 years ago

Fixed the issue: you just need to scale the tv/style/content weights by the height. If you don't, the content loss value won't go up, the optimizer doesn't know what to do, and it just stops. I'm currently on iteration 264 at 2200 pixels, running on the CPU to get around VRAM limitations. Fingers crossed it remains stable.
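
The comment does not include the actual code, so the following is only a sketch of what scaling the weights by height might look like. The helper `scale_weights`, the reference height of 400, the placeholder weight values, and the `power` parameter are all assumptions; the comment does not say whether the weights should be multiplied or divided, so a negative `power` is left available to divide instead.

```python
def scale_weights(weights, height, reference_height=400, power=1.0):
    """Hypothetical helper: scale each loss weight by (height / reference_height) ** power.

    power=1.0 multiplies by the height ratio; power=-1.0 divides by it.
    """
    if height <= 0 or reference_height <= 0:
        raise ValueError("heights must be positive")
    factor = (height / reference_height) ** power
    return {name: value * factor for name, value in weights.items()}


# Placeholder weights for illustration only (assumed, not confirmed by the thread).
base_weights = {"content_weight": 1e5, "style_weight": 3e4, "tv_weight": 1.0}

scaled = scale_weights(base_weights, height=1280)
for name, value in scaled.items():
    print(f"{name}: {value:g}")
```

The scaled values could then be passed to the script in place of its usual weight settings; which direction and exponent work best is something to check experimentally at the target resolution.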

sbetzin commented 2 years ago

@Finerrkekz Would you mind sharing this piece of code? I have also seen that the loss did not change and L-BFGS just stopped. I have tried multiple tv/style/content variations without success. An interesting side note: when I accidentally ran on the CPU, it worked with the default values.

CycloneRing commented 8 months ago

Has anyone fixed this issue?