Open VaKonS opened 7 years ago
Good find. It makes me think further that dropout layers probably do not do any good in style transfer. Shouldn't we omit them altogether, as I did in neural-mirage, by adding
if layer_type ~= "nn.Dropout" then
here https://github.com/jcjohnson/neural-style/blob/master/neural_style.lua#L130
Perhaps this is the reason that NIN has been considered to give poor results?
An alternative to removing the Dropout layer(s) would be to call evaluate() on the model after rebuilding it; this sets train = false for the model. I checked the code of Dropout: there, train is initialized to true.
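For reference, both options can be sketched in a few lines of Torch. This is only a minimal sketch, not the actual loop in neural_style.lua: the tiny `cnn` below is a hypothetical stand-in for the loaded model, and `net` is the rebuilt copy.

```lua
require 'nn'

-- Hypothetical stand-in for a loaded model with Dropout between conv layers.
local cnn = nn.Sequential()
cnn:add(nn.SpatialConvolution(3, 8, 3, 3))
cnn:add(nn.ReLU())
cnn:add(nn.Dropout(0.5))
cnn:add(nn.SpatialConvolution(8, 8, 3, 3))

-- Option 1: skip Dropout layers while rebuilding the network.
local net = nn.Sequential()
for i = 1, cnn:size() do
  local layer = cnn:get(i)
  local layer_type = torch.type(layer)
  if layer_type ~= 'nn.Dropout' then
    net:add(layer)  -- the layer object is shared with cnn, not copied
  end
end

-- Option 2: keep every layer but switch the model to evaluation mode,
-- which sets train = false on the contained modules.
cnn:evaluate()

print(net:size(), cnn:size())
```

With either option the forward pass should no longer depend on random numbers, so the choice is mostly about whether one wants to be able to turn the dropout variation back on later.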
PS. I made a quick test with and without the Dropout layer. I did not use a manual seed, so the results are not exactly comparable. The effect of the Dropout layer is not so drastic. In fact, what it does is mask out features randomly, which is a bit like the masking of channels we experimented with recently.
NIN, no dropout:
content_layers relu0,relu3,relu7
style_layers relu0,relu3,relu7
style_weight 1e5
image_size 960
NIN, dropout, same settings
@htoyryla, this dropout layer seems to be a way to additionally randomize an image.
When the process is initialized with the source image (not random noise):
Here is an example with and without the dropout layer (NIN ImageNet, L-BFGS optimizer), at 150, 600 and ~2400 iterations. Some features appear without dropout, other features vanish. It looks like dropout doesn't affect optimization speed or improve the image; it simply produces a different variation of the style:
I know. The Dropout layer is a random mask, usually used during training to prevent overfitting. That is why I suggested that leaving it out might improve the quality when using NIN or any other model with Dropout between conv layers.
Then, as I also noted in my previous comment, using the Dropout layer produces variations for the same reason as using selected channels only which I experimented with recently (as Dropout masks away random outputs in each channel).
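To make the masking concrete, here is a small sketch of what a single nn.Dropout layer does to a feature map. The 2-channel tensor of ones is a made-up input, and the fixed seed is there only to make the sketch repeatable.

```lua
require 'nn'

torch.manualSeed(1)                -- fixed seed only so the sketch is repeatable
local drop = nn.Dropout(0.5)       -- Dropout starts in training mode
local x = torch.ones(1, 2, 4, 4)   -- hypothetical 2-channel feature map

-- Training mode: every forward pass draws a new random mask,
-- so different outputs are zeroed on each pass.
local y1 = drop:forward(x):clone()
local y2 = drop:forward(x):clone()
print(y1:eq(0):sum(), y2:eq(0):sum())  -- number of masked outputs per pass

-- Evaluation mode: the layer passes its input through unchanged
-- (assuming the default, non-v1 behaviour of nn.Dropout).
drop:evaluate()
local y3 = drop:forward(x):clone()
print((y3 - x):abs():max())
```

The mask is drawn per output element rather than per channel, which is why the effect resembles channel masking only loosely.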
So I was not really arguing against using Dropout; both using dropout and leaving it out make sense. And so does adding a dropout layer to a VGG model. As long as one can choose.
PS. Torch's nn now also has a SpatialDropout layer; it might be interesting to see how it would affect the results.
PPS. I had the wrong impression of how SpatialDropout works... "extends this dropout value across the entire feature map" might not be a good idea here.
When the model is being loaded and rebuilt, Torch's random number generator is not yet initialized with the manual seed. As a result, the NIN ImageNet model always produces random images, even with the "-seed" option, because it uses a dropout layer with random values.
Placing the RNG initialization at the start allows repeatable results (with a manual seed) with the NIN ImageNet model, as with the VGG models.
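The ordering issue can be reproduced in isolation. The sketch below is an illustration under assumptions, not the neural_style.lua code: `USER_SEED` stands in for the value given to -seed, `startup_seed` stands in for the unpredictable RNG state of a fresh process, and a one-layer net with Dropout stands in for the model rebuild, assuming that Dropout draws random numbers while the model is being rebuilt.

```lua
require 'nn'

local USER_SEED = 123  -- hypothetical value passed via -seed

-- startup_seed mimics the unpredictable RNG state of a fresh process.
local function captured_target(startup_seed, seed_before_build)
  torch.manualSeed(startup_seed)
  if seed_before_build then torch.manualSeed(USER_SEED) end

  -- Stand-in for the model rebuild: a forward pass through Dropout
  -- consumes random numbers from the global generator.
  local net = nn.Sequential():add(nn.Dropout(0.5))
  local target = net:forward(torch.ones(1000)):clone()

  -- Seeding here is too late to affect the target captured above.
  if not seed_before_build then torch.manualSeed(USER_SEED) end
  return target
end

local function same(a, b)
  return (a - b):abs():max() == 0
end

-- Seed set before the build: two "runs" agree.
print(same(captured_target(1, true), captured_target(2, true)))
-- Seed set after the build: the two "runs" see different dropout masks.
print(same(captured_target(1, false), captured_target(2, false)))
```

The first comparison should come out identical and the second should not, which matches the behaviour described above for NIN with and without early seeding.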