kaishengtai / neuralart

An implementation of the paper 'A Neural Algorithm of Artistic Style'.
MIT License

Seeking help understanding the proposed method #10

Closed by diPDew 9 years ago

diPDew commented 9 years ago

I'm having trouble understanding how to train or fine-tune a deep model using the loss function in Eq. (7) of the original paper. Does the training process involve only two images (one for content and one for style), fine-tuning the existing model to update the weights of the convolutional layers? If so, how many iterations are needed in general to produce the final results (i.e., reconstructed images from different convolutional layers)?

Thanks very much in advance.

kaishengtai commented 9 years ago

> Does the training process involve only two images (one for content and one for style), fine-tuning the existing model to update the weights of the convolutional layers?

The weights of the convolutional layers are held fixed, so there is no fine-tuning of the model parameters. Instead, we optimize the input to the convnet (i.e., a 3×H×W tensor) to minimize the cost function given in Eq. (7) of the paper. This is done by backpropagating through the network to obtain the gradient of the loss with respect to the input, which the optimizer then uses to update the input.
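
As a rough illustration of optimizing the input rather than the weights, here is a toy sketch. It is not the code in this repository, and everything in it is an assumption made for illustration: a small fixed linear map `W` stands in for the convnet, a squared feature error stands in for the cost in Eq. (7), plain gradient descent stands in for L-BFGS, and the names `features`, `loss_and_grad`, and `optimize_input` are hypothetical.

```python
# Toy stand-in for a convnet: a fixed linear feature map (never updated).
W = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]

def features(x):
    """Forward pass: compute W @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def loss_and_grad(x, target):
    """Squared feature error and its gradient with respect to the input x."""
    diff = [f - t for f, t in zip(features(x), target)]
    loss = sum(d * d for d in diff)
    # Backpropagate through the linear map: dL/dx = 2 * W^T @ diff.
    grad = [2.0 * sum(W[i][j] * diff[i] for i in range(len(W)))
            for j in range(len(x))]
    return loss, grad

def optimize_input(x0, target, lr=0.05, steps=200):
    """Gradient descent on the input; the weights W are left untouched."""
    x = list(x0)
    for _ in range(steps):
        _, grad = loss_and_grad(x, target)
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x

# Features of a hypothetical reference input define the target.
target = features([0.5, -1.0, 2.0])
x_init = [0.0, 0.0, 0.0]  # hypothetical starting "image"
x_opt = optimize_input(x_init, target)

initial_loss, _ = loss_and_grad(x_init, target)
final_loss, _ = loss_and_grad(x_opt, target)
```

Only `x` changes between iterations; `W` is read but never written, which mirrors holding the convolutional weights fixed. In the real setting the forward and backward passes go through the full convnet, and the step size and iteration count used here do not carry over.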

> how many iterations are needed in general to produce the final results

This depends on the convnet architecture, initialization, and optimization method. I find that with VGG-19, L-BFGS, and random initialization, you usually get good results by iteration 300-400 (and sometimes even sooner).

Hope that helps.

diPDew commented 9 years ago

I see. Thanks a lot!