jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

Question: How does this avoid pixel values outside of range #433

Open ghost opened 6 years ago

ghost commented 6 years ago

I was wondering how this avoids having the optimization steps push the image pixel values outside of the valid RGB range.

edit: I was playing with implementing my own version and noticed that a clamp inside the optimization loop caused the loss to explode, which is why I'm asking.

htoyryla commented 6 years ago

What range did you clamp to? Neural-style reads images in the 0..255 range and then subtracts the mean pixel values {103.939, 116.779, 123.68}.
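
For reference, a minimal sketch of that preprocessing in PyTorch (the repo itself is Torch/Lua, so the function and variable names here are illustrative, not the actual code):

```python
import torch

# BGR mean pixel values used by the Caffe-trained VGG models, on the 0..255 scale.
VGG_MEAN = torch.tensor([103.939, 116.779, 123.68]).view(3, 1, 1)

def preprocess(img):
    """img: float tensor of shape (3, H, W), RGB, values in 0..1."""
    img = img * 255.0            # scale up to the 0..255 range
    img = img[[2, 1, 0], :, :]   # RGB -> BGR, as the Caffe models expect
    return img - VGG_MEAN        # after this, values span roughly -124..152
```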

ghost commented 6 years ago

I followed this tutorial, http://pytorch.org/tutorials/advanced/neural_style_tutorial.html, so I clamped to 0..1; the VGG they reference is trained on 0..1 inputs.

If you leave that clamp in and train past the default iteration count, the loss goes nuts. I think this is an effect of the clamp not being part of the Adam optimizer step.

edit: is it enough to subtract the ImageNet mean and just clamp the result? Is that all this does?
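
To illustrate the distinction being drawn here, a sketch of clamping applied outside the autograd graph, after each optimizer step (a toy MSE loss stands in for the real style/content losses; everything here is illustrative, not the tutorial's code):

```python
import torch
import torch.nn.functional as F

img = torch.rand(1, 3, 64, 64, requires_grad=True)  # image being optimized, 0..1 scale
target = torch.rand(1, 3, 64, 64)                   # stand-in for style/content targets
optimizer = torch.optim.Adam([img], lr=0.01)

for step in range(200):
    optimizer.zero_grad()
    loss = F.mse_loss(img, target)  # placeholder for the real losses
    loss.backward()
    optimizer.step()
    # Clamp after the update and outside the graph, so Adam's moment
    # estimates never interact with the clamp. (The comment above
    # attributes the exploding loss to a clamp that is not part of
    # the Adam step itself.)
    with torch.no_grad():
        img.clamp_(0, 1)
```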

htoyryla commented 6 years ago

Are you talking about this neural-style by @jcjohnson or the PyTorch neural-style tutorial? Here, clamping to 0..1 makes no sense, as the pixel values can range between -100 and 100, and even beyond.

The PyTorch code is different. I've seen the tutorial; the explanation of the process is good, but I don't know if anyone here is that familiar with that implementation.

ghost commented 6 years ago

Ah, sorry for being unclear; I meant the clamping in the PyTorch neural style tutorial.

So it seems that this neural-style subtracts the dataset mean from the incoming images, doesn't clamp during optimization, and when saving, adds the mean back and then clamps between 0 and 255, correct? No oddities like regularizing the pixel values or anything?

htoyryla commented 6 years ago

> So it seems that this neural-style subtracts the dataset mean from the incoming images, doesn't clamp during optimization, and when saving, adds the mean back and then clamps between 0 and 255, correct?

That's how it looks from the code, except that there is also a TV regularization layer at the bottom of the network. You can adjust the regularization with -tv_weight, and remove the layer by setting it to 0.
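
A sketch of both pieces: the save-time deprocessing described above, and a squared total-variation term analogous to neural-style's TVLoss layer (PyTorch-style code for illustration; the actual implementation is Torch/Lua and details such as the exact TV formulation may differ):

```python
import torch

VGG_MEAN = torch.tensor([103.939, 116.779, 123.68]).view(3, 1, 1)

def deprocess(img):
    """Undo the preprocessing for saving: add the mean back, BGR -> RGB,
    and clamp to 0..255, the only clamp in the whole pipeline."""
    img = img + VGG_MEAN
    img = img[[2, 1, 0], :, :]        # BGR -> RGB
    return img.clamp(0, 255) / 255.0  # back to a displayable 0..1 image

def tv_loss(img, tv_weight):
    """Squared total-variation penalty, analogous to the TVLoss layer
    controlled by -tv_weight (0 removes it). img: (3, H, W) tensor."""
    h_var = (img[:, 1:, :] - img[:, :-1, :]).pow(2).sum()
    w_var = (img[:, :, 1:] - img[:, :, :-1]).pow(2).sum()
    return tv_weight * (h_var + w_var)
```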