jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

What possibilities do I get with titan pascal sli #351

Open Grume opened 7 years ago

Grume commented 7 years ago

I'm planning to buy two Titans. Will it be possible to use the combined 24 GB for one image at once? What would NVLink technology give this project?

jcjohnson commented 7 years ago

neural-style cannot distribute computation over multiple GPUs. It would be theoretically possible, but a bit of a pain to implement.

Grume commented 7 years ago

@jcjohnson thx for the fast reply! If it's theoretically possible, how long might it take to develop, and at what price could it be commissioned?

endymion commented 7 years ago

A way to split neural-style over multiple GPUs so that artists could achieve higher-resolution output would be a gift to the world. I'm still on a quest to do any kind of style transfer with output in the 4K+ range so that it could be used as a serious art tool for printing on paper. The main limitation is fitting the entire computation into a single GPU's memory. It's easy to rent a cloud instance with a bunch of GPUs, or to buy multiple GPUs, but even if you have the money you still can't achieve high-resolution output with any implementation I've found, even with 24+ GB of GPU memory and optimized settings.

crasse2 commented 7 years ago

If the processing of a single source image can't be distributed over multiple GPUs, maybe another solution for multi-GPU processing is closer to the tiling system that some people are trying to build (in another thread, here https://github.com/jcjohnson/neural-style/issues/337). I haven't had time to try it yet, but this could be a way: divide the source into tiles, and send each tile's neural-style process to a different GPU at once?
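For anyone curious, the splitting/stitching half of that tiling idea can be sketched in a few lines. This is a hypothetical Python illustration only; the stylization of each tile would still be a separate neural-style run, and (as noted below) tiling has its own style-scale problems:

```python
import numpy as np

def split_into_tiles(img, tile_h, tile_w):
    """Split an H x W x C image into (row, col, tile) entries.
    Edge tiles may be smaller when the size does not divide evenly."""
    h, w = img.shape[:2]
    tiles = []
    for r in range(0, h, tile_h):
        for c in range(0, w, tile_w):
            tiles.append((r, c, img[r:r + tile_h, c:c + tile_w]))
    return tiles

def stitch_tiles(tiles, h, w, channels):
    """Reassemble the (stylized) tiles into a full image."""
    out = np.zeros((h, w, channels), dtype=tiles[0][2].dtype)
    for r, c, tile in tiles:
        th, tw = tile.shape[:2]
        out[r:r + th, c:c + tw] = tile
    return out
```

In practice one would also want overlapping tiles and blending at the seams, which this sketch omits.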

htoyryla commented 7 years ago

Another problem is that as the image resolution increases, the receptive fields of the CNN see smaller parts of the image, so the style of the output image is no longer the same as for a smaller image. The tiling method suffers from the same problem.

This phenomenon has been mentioned now and then since neural-style was introduced, and I have long suspected that it happens because the model works best at a certain resolution: when it scales to a larger image, the nodes see a smaller part of the image. Now the following paper confirms this (see section 6.2) and states that VGG19 works best with 512x512px images; at this size, the style features are best preserved. The paper also proposes a method to overcome this problem.

https://arxiv.org/pdf/1611.07865v1.pdf

jcjohnson commented 7 years ago

@htoyryla You are exactly correct. I've also read the linked paper, and it's very easy to implement their multiscale generation in neural-style. I've been playing around with it for the past few days, and it works really well for generating high-resolution outputs. Here's an example:

[image: out4]

This image was generated with this script:

https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db

Where the style image is a super-high res scan of Starry Night from the Google Art project:

https://upload.wikimedia.org/wikipedia/commons/e/ea/Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg
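The gist above is the authoritative script; its control flow can be sketched roughly like this (Python, with stand-in `stylize` and `upscale` functions in place of the actual neural_style.lua runs and image resizing):

```python
def multiscale_stylize(content, sizes, stylize, upscale):
    """Sketch of multiscale generation: stylize at the smallest size,
    then at each larger size re-run with the previous result, upscaled,
    as the initialization. `stylize` and `upscale` are hypothetical
    stand-ins for a neural_style.lua run and an image resize."""
    result = None
    for size in sizes:
        init = upscale(result, size) if result is not None else None
        result = stylize(content, size, init)
    return result
```

Each pass inherits the coarse structure found at the smaller scale and only adds finer detail, which is why the later high-resolution passes need few iterations.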

crasse2 commented 7 years ago

Wow, that's an awesome method! So basically it generates the main stylization structure in the first small output, which leads to a broader graphic interpretation of the large shapes from the source image, and then "repaints" it at larger scales with more detail from the style image while keeping the stylization structure from the small one. I've also run into problems with large outputs (homogeneous results with swarming details and loss of the main graphic structure of the source image), so thanks for sharing this, I have to try it :D !

jcjohnson commented 7 years ago

@crasse2 Yep, that's the right way to think about it.

With single-scale generation, the absolute scale of style features in the style image will be the same as the absolute scale of transferred style features in the generated image. Multi-scale generation allows you to decouple these; by tuning the multiscale schedule, you can select features from the style image at an arbitrary scale, and render them at an arbitrary scale in the generated image. The other benefit is speed: you don't need to run many iterations at the high resolutions. The example I posted takes about 5 minutes to generate on a Pascal Titan X.

For fun, here's another example using Composition VII as the style:

[image: out4_composition]

xpeng commented 7 years ago

Great improvement! Can this multi-scale technique be used in fast-neural-style?

crasse2 commented 7 years ago

gorgeous result here ! awesome

jcjohnson commented 7 years ago

@xpeng There is a paper on arXiv from a few days ago explaining how to apply this multi-scale tech to fast-neural-style:

https://arxiv.org/abs/1612.01895

I haven't implemented it yet, but I'd expect it to give similarly good results.

xpeng commented 7 years ago

@jcjohnson, oh thanks! An explosion of new work!

jcjohnson commented 7 years ago

I'm happy to announce that neural-style now supports multiple GPUs!

The multi-GPU implementation splits the VGG loss network so that different layers are computed on different GPUs; this allows you to process very large images. On my desktop with two Titan X, I can create images up to about 2800px in width; on a server with four Titan X I can create images up to about 3600px in width. Here's an example:

[image: starry_stanford_bigger]

To use multiple GPUs you pass a comma-separated list to the -gpu flag, and also pass -multigpu_strategy, a comma-separated list of layer indices at which to split the network.
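As a rough illustration of what such a split strategy does, here is a Python sketch under the assumption that each index marks the first layer of the next GPU's chunk (this is an assumed reading of the flag's semantics, not the repo's Lua code):

```python
def partition_layers(layers, split_indices):
    """Partition `layers` into consecutive per-GPU chunks.
    Assumption for illustration: each index in `split_indices` is the
    first layer of the next GPU's chunk."""
    bounds = [0] + list(split_indices) + [len(layers)]
    return [layers[a:b] for a, b in zip(bounds, bounds[1:])]
```

So with ten layers and split indices (3, 6), GPU 0 would run layers 0-2, GPU 1 layers 3-5, and GPU 2 layers 6-9.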

Along the way I've also implemented a couple of other improvements, including a more memory-efficient method of computing Gram matrices, and exposed the nCorrection parameter of the L-BFGS algorithm as a command-line flag.
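The memory-efficient Gram matrix trick amounts to flattening the C x H x W feature map to C x (H*W) and taking a single matrix product instead of accumulating per-position outer products. A numpy sketch (the normalization constant here is an assumption; implementations divide by different factors):

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a C x H x W feature map, via one matrix product
    on the flattened C x (H*W) view. The 1/(C*H*W) normalization is
    one common choice, assumed here for illustration."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)
```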

Getting good results at high resolutions is a bit tricky:

- You need to use multiscale generation.
- For multi-GPU you may need to tune the -multigpu_strategy flag for your workload.
- At the very highest resolutions you need a small value for -lbfgs_num_correction to avoid heavy memory usage in the L-BFGS cache variables.
- You also need to disable cudnn autotuning, since it uses a lot of auxiliary memory.

To demonstrate these tricks, check out this script:

https://github.com/jcjohnson/neural-style/blob/multi-gpu/examples/multigpu_scripts/starry_stanford.sh

which was used to generate the 3620px Starry Stanford result on a quad Titan X server.

ProGamerGov commented 7 years ago

@jcjohnson With the new update, are the Spatial Control, Colour Control, and Luminance-only Transfer features described in the "Controlling Perceptual Factors in Neural Style Transfer" research paper also added to Neural-Style? Or just support for multiple GPUs?

jcjohnson commented 7 years ago

This update only adds multi-GPU support; the example script shows how to use neural-style for scale control as described in the paper.

ProGamerGov commented 7 years ago

@jcjohnson Using the exact same settings, it seems there might be an issue with the latest update.

The older Neural-Style before the "multi-gpu" update:

And the current version of Neural-Style:

It completely ignores the face now and the background becomes patches of fading style. The style image can be found here. The content image can be found here.

Did any default settings change? The new output kinda reminds me of Fast-Neural-Style, so maybe it's something related to the changes that made Neural-Style's code more similar to Fast-Neural-Style's?

jcjohnson commented 7 years ago

@ProGamerGov That does not look good. I can try to investigate if you send me the settings, style and content image.

bododge commented 7 years ago

I'm getting a slew of nil value messages before the script runs when using the nn backend. The script then runs, produces images, and shows the usual output, but the images are not very styled; they just look sort of blurry. I'm assuming it's not calculating the style/content loss correctly.

ProGamerGov commented 7 years ago

@jcjohnson @bododge I was also getting nil values when I made the "current version of Neural-Style" output image, though I was using -cudnn_autotune, -backend cudnn, and -optimizer adam.

I tested many different variations of Neural-Style commands, and I can rule out an accidental change to one or more of them. So I'd agree with @bododge's reasoning that the issue lies in the style and/or content loss calculations.

I think this commit might be the cause: https://github.com/jcjohnson/neural-style/commit/ea75cbc5ba196055c0ac2fe5d9960efe33cbbf05

The commit's description:

1. Use modular content and style loss modules similar to fast-neural-style
   for cleaner logic around network setup.
2. More memory-efficient Gram matrix implementation similar to
   fast-neural-style.
3. Multi-gpu support! Use nn.GPU decorators to compute different layers
   of the loss network on different GPUs.

Grume commented 7 years ago

@jcjohnson thx for awesome work!

"On my desktop with two Titan X, I can create images up to about 2800px in width; on a server with four Titan X I can create images up to about 3600px in width"

Why do two additional Titans give a gain of only 800 pixels?

jcjohnson commented 7 years ago

@ProGamerGov @bododge The nil values are a bit of debug printing that I forgot to remove; those are not a problem.

@ProGamerGov I'm having a hard time reproducing your results. Running your provided style and content images on the current master, I get this:

[image: out_new]

The exact command is this:

th neural_style.lua -content_image content.jpg -style_image style.jpg -gpu 0 -backend cudnn -cudnn_autotune -seed 0

Running the same command on this commit https://github.com/jcjohnson/neural-style/commit/197ad4218f0536b3b284ea322a190d507f4d8757 gives the same result:

[image: out_old]

Can you post an example of a command that gives different results now?

BTW you can check out the old version of the code like this:

git checkout 197ad4218f0536b3b284ea322a190d507f4d8757

And you can switch back to the current master like this:

git checkout master

jcjohnson commented 7 years ago

@Grume With perfect multi-GPU scaling, you'd expect to process twice as many pixels with four GPUs as with two; in practice you don't get perfect scaling because multi-GPU use has overhead: some values need to be duplicated on multiple GPUs, and there may not be a way to partition the network so that each GPU's memory is fully utilized.

Nevertheless, the output with two GPUs is 2800 x 1475 = 4130000 pixels and with four GPUs we get 3620 x 1905 = 6896100 pixels; thus with four GPUs we get 1.67x as many pixels as with two GPUs. This certainly isn't perfect scaling, but it's not actually too far away.
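The scaling arithmetic above, checked in a couple of lines:

```python
# Pixel counts quoted in the comment above.
two_gpu_pixels = 2800 * 1475   # two Titan X
four_gpu_pixels = 3620 * 1905  # four Titan X
ratio = four_gpu_pixels / two_gpu_pixels
print(two_gpu_pixels, four_gpu_pixels, round(ratio, 2))
```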

bododge commented 7 years ago

@jcjohnson I was able to reproduce your test results above and got identical images in both tests, but I am seeing a problem with the -normalize_gradients flag. When I use it on the newer build, it's almost as if the photo is being used as the style image. Here is a well-documented imgur album with the command under each result: http://imgur.com/a/h9FRL

jcjohnson commented 7 years ago

Thanks for the test case @bododge! I accidentally introduced a bug with the -normalize_gradients flag; it should now be fixed in https://github.com/jcjohnson/neural-style/commit/e1ff8fa45dded440ede9e6cc084c3334e3859f67. I'm really sorry about that!

bododge commented 7 years ago

It's working great now, thank you!

bododge commented 7 years ago

@jcjohnson -init_image is a new feature, right? I tried it on the old version and it didn't work. I'll have to incorporate this into my video script. Do you think it will help with optical flow?

jcjohnson commented 7 years ago

@bododge Yes -init_image is relatively new. I think that this will likely help with video processing.

ProGamerGov commented 7 years ago

@jcjohnson Looking more closely at my experiments to determine the cause, it does seem that the -normalize_gradients issue was behind the problem with the output image. Not sure how I missed that; it's really noticeable when I look closely at my test outputs.

The latest version of Neural-Style, with the -normalize_gradients issue resolved, now works properly again!

Thanks for the fix!

ProGamerGov commented 7 years ago

Does the new -init_image feature work in tandem with -init image/random? Or should only one of the two be used at a time?

jcjohnson commented 7 years ago

@ProGamerGov Yes if you want to initialize from a specific image then you need to pass -init image and give the path to your initialization image in -init_image.

bododge commented 7 years ago

@jcjohnson The -init_image function did help the video processing, but not at all how I thought it would. In many cases -init_image seems to have a more profound effect on the result than the content image itself. When I set it up to initialize from the previous frame, the result mostly stayed still and didn't create animation. I ended up setting -content_image to the previous frame instead and using -init_image to start with the current video frame; it took me a little while to realize that was a viable option. I think the coherency is notably improved over not using it. I'll likely set up a more purist example to show the effect of this technique and share it when I have it.

jcjohnson commented 7 years ago

@bododge Thanks for sharing - your video results are looking good! One idea that might improve your results is to use optical flow to warp the stylized result from the previous frame, and use the warped result as the -init_image. This idea is described in Section 4.1 of this paper.
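A rough sketch of that warm-start idea (illustrative Python with nearest-neighbor sampling; not the paper's actual warping, which also handles occlusions and flow confidence):

```python
import numpy as np

def warp_by_flow(prev_frame, flow):
    """Warp the previous stylized frame along a backward flow field
    (flow[y, x] = (dy, dx) points at the source pixel), using
    nearest-neighbor sampling and edge clamping. A sketch of the
    warm-start idea only."""
    h, w = prev_frame.shape[:2]
    out = np.empty_like(prev_frame)
    for y in range(h):
        for x in range(w):
            sy = min(max(int(round(y + flow[y, x, 0])), 0), h - 1)
            sx = min(max(int(round(x + flow[y, x, 1])), 0), w - 1)
            out[y, x] = prev_frame[sy, sx]
    return out
```

The warped result would then be passed via -init_image (with -init image) when stylizing the next frame.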

bododge commented 7 years ago

I have set optical flow up with larspars' example and actually have the option to use it within the same script, but the results are only good for low-motion footage. For a higher rate of motion, like someone running, the optical flow available with that script is limiting and things come out smeared. I wish I could get Manuel Ruder's version working on my setup, but DeepFlow doesn't run on Mac, and I don't know enough about clang-omp or similar to get it all working. I'm just a video editor, not a dev! :D

ProGamerGov commented 7 years ago

@jcjohnson I made a side-by-side comparison of the effect of using -init_image previous_image in a script built for creating a "zooming" effect with Neural-Style. The new command seems to produce fewer "muddy" spots. It's a pretty simple script and does not use any optical flow.

Link to the video showing the test: https://www.youtube.com/watch?v=J3ERZoE61nw

I used this content image and this style image in the video.

Grume commented 7 years ago

@jcjohnson What image size could be obtained from two GTX 1070s with 16 GB of combined memory? What's your opinion?

ProGamerGov commented 7 years ago

First I quickly made a 512px image with 1500 iterations:

Then I used that image as the value for the -init_image flag. I used the exact same command, including the same content image (a CC0 image I found) and style image. The image below is at 50 iterations with an -image_size value of 1000:

The look at iteration 50 did not really change by iteration 400. Without the -init_image flag, this is what it looks like at 50 and 400 iterations.

It looks like -init_image can be used to significantly speed up the workflow of gradually increasing the -image_size value while repeatedly running the output back through Neural-Style.

bododge commented 7 years ago

@jcjohnson Will a single Titan X Pascal outperform a single Titan X Maxwell in terms of output resolution? My guess is no, since its the same amount of vram, but I'd like to hear it from you before I buy another maxwell card.

jcjohnson commented 7 years ago

@bododge Titan X Pascal and Titan X Maxwell should not differ in terms of output resolution; resolution is constrained by memory and both have 12GB. However a Titan X Pascal will obviously be a lot faster.

Also you don't have to use identical GPUs for multi-GPU processing; in my desktop for example I have a Maxwell Titan X and a Pascal Titan X. However if you want to use SLI for gaming then you need identical cards.

bododge commented 7 years ago

@jcjohnson Thanks for confirming. Since I'm only really interested in increasing resolution, I'm going with another maxwell for now. Appreciate it.

dovanchan commented 7 years ago

Do you know how I can get a result like Dreamscope's? I think their implementation would be wonderful for me; it takes only 30 seconds. Can you share the parameters? I think there's a way to approach Dreamscope's result in under a minute.

dovanchan commented 7 years ago

Here is the website of Dreamscope. I think they have done well.

https://dreamscopeapp.com/

dovanchan commented 7 years ago

@ProGamerGov Hey, in this result you made, can you tell me what parameters and command you used?