jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

Initialize random number generator at start #408

Open · VaKonS opened this issue 7 years ago

VaKonS commented 7 years ago

When the model is being loaded and rebuilt, Torch's random number generator is not yet initialized with the manual seed, so the NIN ImageNet model always produces random images, even with the "-seed" option.

This is because the NIN ImageNet model contains a dropout layer, which draws random values.

Moving the RNG initialization to the start allows repeatable results (with a manual seed) with the NIN ImageNet model, just as with the VGG models.
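
As a minimal sketch, assuming the usual structure of main() in neural_style.lua (the seed check is the existing one, only moved before the model is loaded):

    local function main(params)
      -- Seed Torch's RNG before the model is loaded and rebuilt,
      -- so that layers such as nn.Dropout draw reproducible values.
      if params.seed >= 0 then
        torch.manualSeed(params.seed)
      end

      -- Only then load and rebuild the network:
      local cnn = loadcaffe.load(params.proto_file, params.model_file, params.backend):float()
      -- ... rest of main() unchanged ...
    end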

htoyryla commented 7 years ago

Good find. It makes me think further that a dropout layer probably does not do any good in style transfer. Shouldn't we omit them altogether, as I did in neural-mirage, by adding

 if layer_type ~= "nn.Dropout" then

here https://github.com/jcjohnson/neural-style/blob/master/neural_style.lua#L130
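
As a sketch of where the check would go in the rebuild loop linked above (loop and variable names follow neural_style.lua; the loop body is abbreviated):

    for i = 1, #cnn do
      if next_content_idx <= #content_layers or next_style_idx <= #style_layers then
        local layer = cnn:get(i)
        local layer_type = torch.type(layer)
        -- Skip Dropout layers entirely; everything else is copied
        -- into net and the loss modules are inserted as before.
        if layer_type ~= "nn.Dropout" then
          net:add(layer)
          -- ...
        end
      end
    end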

Perhaps this is the reason that NIN has been considered to give poor results?

An alternative to removing the Dropout layer(s) would be to call evaluate() on the model after rebuilding it; this sets train = false for the whole model. I checked the code of Dropout: train is initialized to true there.
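
As a sketch, assuming the rebuilt network is the net variable in neural_style.lua:

    -- After the network has been rebuilt:
    net:evaluate()  -- recursively sets train = false, so nn.Dropout
                    -- passes activations through instead of masking them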

PS. I made a quick test with and without the Dropout layer. I did not use a manual seed, so the results are not exactly comparable. The effect of the Dropout layer is not so drastic. In fact, what it does is mask out features randomly, a bit like the masking of channels we experimented with recently.

[image: NIN, no dropout; content_layers relu0,relu3,relu7, style_layers relu0,relu3,relu7, style_weight 1e5, image_size 960]

[image: NIN, with dropout, same settings]

VaKonS commented 7 years ago

@htoyryla, this dropout layer seems to be a way to additionally randomize the image.

When the process is initialized with the source image (not with random noise):

Here is an example with and without the dropout layer (NIN ImageNet, L-BFGS optimizer) at 150, 600 and ~2400 iterations. Some features appear without dropout, others vanish. Dropout does not seem to affect optimization speed or improve the image; it simply produces a different variation of the style:

[image: with vs. without dropout at 150, 600 and ~2400 iterations]

htoyryla commented 7 years ago

I know. The Dropout layer is a random mask, usually used during training to prevent overfitting. That is why I suggested that leaving it out might improve quality when using NIN or any other model with Dropout between conv layers.

Then, as I also noted in my previous comment, using the Dropout layer produces variations for the same reason as using only selected channels, which I experimented with recently (Dropout masks out random outputs in each channel).

So I was not really arguing against using Dropout; both using it and leaving it out make sense, and so does adding a dropout layer to a VGG model, as long as one can choose.
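
For example (illustrative only; the insertion index and dropout probability here are made up, not taken from any model):

    -- Insert a Dropout layer into the rebuilt net at an arbitrary
    -- position; index 10 and p = 0.5 are only placeholders.
    net:insert(nn.Dropout(0.5), 10)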

PS. torch/nn now also has a SpatialDropout layer; it might be interesting to try how it would affect the results.

PPS. I had a wrong impression of how SpatialDropout works... "extends this dropout value across the entire feature map" might not be a good idea here.