jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

Parameters for high resolution #199

Open · errolgr opened this issue 8 years ago

errolgr commented 8 years ago

Could someone please share some insight on getting the parameters right for higher-resolution images like 1920 x 1080? I've been playing with the parameters and I just can't seem to get them right. My style image is 1920 x 1080, as is my content image.

Thanks!
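For reference, the kind of run I mean, with everything except the size left at the defaults (file names are placeholders):

    th neural_style.lua \
      -content_image content.jpg -style_image style.jpg \
      -image_size 1920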

htoyryla commented 8 years ago

I assume your problem is that the large images do not look the same as smaller output sizes. This has been asked and discussed several times already, and to my knowledge nobody has reported how to achieve consistent, scalable results. I have my doubts that it is possible at all by merely tuning the parameters. The networks used have fixed-size convolution kernels, so when the image size is increased, each kernel sees a smaller part of the image; for example, a feature whose receptive field spans a quarter of an 800px image spans only about a tenth of a 1920px one. I don't have the theoretical expertise to say that this (or some similar factor) affects the results, but my engineer's instinct tells me that it might.

errolgr commented 8 years ago

@htoyryla Yes, my large images do not look the same as smaller output sizes. Have you experimented with large images? If so, could you share your parameters? I wouldn't know where to start otherwise. I'm running a render right now, experimenting with different ranges, but it would be nice to have a decent starting point.

charlesfg commented 8 years ago

What about increasing the style_scale?
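Something like this, as I understand the flag (paths are placeholders; -style_scale sets the scale at which features are extracted from the style image, relative to -image_size):

    th neural_style.lua \
      -content_image content.jpg -style_image style.jpg \
      -image_size 1920 -style_scale 1.5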

htoyryla commented 8 years ago

Doreets notifications@github.com wrote on 6 Apr 2016 at 2:23:

@htoyryla Yes, my large images do not look the same as smaller output sizes. Have you experimented with large images? If so, could you share your parameters? I wouldn't know where to start otherwise. I'm running a render right now, experimenting with different ranges, but it would be nice to have a decent starting point.

I mainly work with smaller image sizes, around 800px. With my current setup, 24 GB of RAM limits the size to a little over 1000px when using VGG19 and L-BFGS (I don't usually use ADAM, as I find that L-BFGS converges more easily, at least for my purposes).

With nin-imagenet-conv, I can get up to 1920px and still have room for style scaling above 1. I made some tests and posted the results here: http://liipetti.net/erratic/tests/1920px-tests/

I started pretty much with the default parameter values and then decreased content_weight in steps to have a more pronounced style effect. Then I also tried doubling and halving the style scale.

In these tests I ran only 400 to 1000 iterations, just enough to see the effect but not necessarily to obtain the full quality.
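For concreteness, a 1920px NIN run along those lines might look roughly like this (file names depend on how you downloaded the NIN model, and the layer choice just follows the usual NIN example rather than any careful tuning):

    th neural_style.lua \
      -content_image content.jpg -style_image style.jpg \
      -image_size 1920 -style_scale 1.0 \
      -model_file models/nin_imagenet_conv.caffemodel \
      -proto_file models/train_val.prototxt \
      -content_layers relu0,relu3,relu7,relu12 \
      -style_layers relu0,relu3,relu7,relu12 \
      -num_iterations 1000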

Hannu


errolgr commented 8 years ago

@charlesfg I've been playing with style_scale, but it only seems to affect the style that gets transferred onto the image. The problem I was having is that the style was barely showing in the output image. I have improved my results by tweaking the style weight.

With nin-imagenet-conv, I can get up to 1920px and still have room for style scaling above 1. I made some tests and posted the results here: http://liipetti.net/erratic/tests/1920px-tests/

Interesting! These tests will help me with my baseline parameters. Hopefully I will be able to reuse these parameters on other 1920px images without having to make significant changes.

I started pretty much with the default parameter values and then decreased content_weight in steps to have a more pronounced style effect. Then I also tried doubling and halving the style scale.

I tried doing this, except that I kept content_weight the same and have been doubling the style weight to try to produce the best results. I haven't really needed to change style_scale yet, but I assume I would need to if I were getting repeating patterns. So far I've been able to produce decent results, though I'm still not happy with them.

Take a look at my result (700 iterations, style_weight = 2e4, content_weight = 5e3). I'd like to try doubling the style weight, as I feel the style is pretty weak in this image.

http://imgur.com/HfCaaJA
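For reference, that run corresponds to roughly this invocation, assuming the full 1920px output size (image paths are placeholders):

    th neural_style.lua \
      -content_image content.jpg -style_image style.jpg \
      -image_size 1920 -num_iterations 700 \
      -style_weight 2e4 -content_weight 5e3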

Let me know what you think!

htoyryla commented 8 years ago

Doreets notifications@github.com wrote on 6 Apr 2016 at 17:29:

Take a look at my result (700 iterations, style_weight = 2e4, content_weight = 5e3). I'd like to try doubling the style weight, as I feel the style is pretty weak in this image.

Very good image quality, but yes, the effect of the style is quite small (or so it seems).

When trying to find the right weights, I usually go in steps of 10x or 0.1x until I find the right order of magnitude; then one can use smaller steps.

It also appears that the absolute values matter, so increasing one is not the same as decreasing the other; one needs to experiment. I also like to print out the losses at each iteration and watch them to see what is going on.
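A crude way to do that kind of magnitude search, as a sketch (paths and the weight range are placeholders; -print_iter 1 logs the losses every iteration):

    for sw in 2e3 2e4 2e5; do
      th neural_style.lua \
        -content_image content.jpg -style_image style.jpg \
        -image_size 1920 -style_weight $sw \
        -print_iter 1 -output_image out_sw$sw.png
    done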

Hannu

errolgr commented 8 years ago

Good to know that the absolute values matter; I was under the impression that the values only mattered in relation to each other (their ratio). I will give your parameters a try and get back to you.

htoyryla commented 8 years ago

Doreets notifications@github.com wrote on 6 Apr 2016 at 22:35:

Good to know that the absolute values matter; I was under the impression that the values only mattered in relation to each other (their ratio). I will give your parameters a try and get back to you.

I am not saying that my values are any better. It was only that, in the case I was testing, increasing the style weight had too much effect, and I found that decreasing the content weight worked better. But with other images it might be different.

Neural-style works by trying to minimize the total loss, which is the sum of the content and style losses (from the selected layers), each multiplied by its respective weight: roughly, total_loss = content_weight * (sum of content losses) + style_weight * (sum of style losses). The optimizer looks at how the loss is changing and adjusts the image accordingly, with the goal of finding a minimum of the total loss. It is not clear to me how the optimizer reacts to the absolute level of the total loss; I could imagine, however, that with higher values the gradients (rates of change) will be steeper too, which would affect how the process searches for the minimum.

Anyhow, it is useful to look at the loss values and how they behave, when experimenting with the weights.

Hannu

austingg commented 8 years ago

@errolgr @htoyryla Have you done further experiments on the absolute values of the style and content weights? In my experiments, there seems to be no obvious difference when using the L-BFGS optimizer.

htoyryla commented 8 years ago

Yubin Wang notifications@github.com wrote on 12 Apr 2016 at 5:29:

@errolgr @htoyryla Have you done further experiments on the absolute values of the style and content weights? In my experiments, there seems to be no obvious difference when using the L-BFGS optimizer.

I have not done extensive systematic testing of this. I have only sometimes found that increasing style_weight tenfold has a much more dramatic effect than decreasing content_weight. I have also found that sometimes increasing both weights by the same amount helps to get the process started (from grey to a real image). And these observations, too, were with L-BFGS.

If these effects are due to the optimization finding a different local optimum, then it may equally well be that, more often than not, there is no such alternative optimum nearby, which would explain why you see no difference.

I was thinking the other day that maybe using -normalize_gradients would make the process immune to the absolute values. I meant to test it, but found it difficult to get good results; probably my weights were too far off the correct ones.
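The kind of test I had in mind, roughly (same placeholder paths as before; note that -normalize_gradients takes no value, and the weights likely need retuning around it):

    th neural_style.lua \
      -content_image content.jpg -style_image style.jpg \
      -image_size 1920 -normalize_gradients -print_iter 1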

Hannu