The following is a TODO list for the Convolutional neural networks for artistic style transfer article.
So what is Prisma and how might it work?
[x] Image: A three part image, showing the content image, the style image and the style-transferred image generated by Prisma. (Needs to be made from scratch.)
[ ] Add the labels c, s and x to the initial images illustrating the style transfer effect
[ ] Tie alpha and beta to the variables introduced in the code later
Convolutional Neural Networks from the ground up
The image classification problem
[x] Image: The image classification problem.
[ ] Redo current placeholder making "white" transparent
[x] Merge or connect the following two thoughts better:
It is my hope that by starting our journey at a fairly basic place and gradually stepping up in complexity as we go along, you get to learn something interesting no matter what your level of expertise.
Convolutional Neural Networks from the ground up
This section offers a brief summary of parts of the Stanford course Convolutional Neural Networks for Visual Recognition (CS231n) that are relevant to our style transfer problem. If you’re even vaguely interested in what you’re reading here, you should go take this course. It is outstanding.
Our first learning image classifier
[x] Merge the content describing supervised learning in a better way with the following content.
A linear score function
[x] Editing: Try to polish the last paragraph.
Softmax activation and cross entropy loss
[ ] Image: An image classifier showing the score function and the loss function. (Redo current hand-drawn image.)
[x] Writing: Note here that the optimisation problem is not well posed, so we need a regularisation term to constrain the parameter search space.
[x] Writing: Conclude with the full loss function (specifically arriving at the form used in the basic TensorFlow MNIST tutorial).
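For reference while writing this up, a minimal NumPy sketch of the score function, softmax and cross-entropy loss with an L2 regularisation term, roughly the form the basic TensorFlow MNIST tutorial arrives at (the names `W`, `b` and `reg` are my placeholders):

```python
import numpy as np

def scores(X, W, b):
    """Linear score function: one raw score per class for each input row."""
    return X @ W + b

def softmax(s):
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    e = np.exp(s - s.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss(X, y, W, b, reg=1e-3):
    """Average cross-entropy over the batch plus an L2 regularisation term on W."""
    p = softmax(scores(X, W, b))
    data_loss = -np.log(p[np.arange(len(y)), y]).mean()
    return data_loss + reg * np.sum(W ** 2)
```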
An iterative optimisation process
[x] Image: A figure illustrating gradient descent (in two dimensions?).
[ ] Writing: Describe the math behind (minibatch) SGD and decaying the learning rate over the course of training; see the sketch below. Introduce a subsection? Tease second-order methods to motivate L-BFGS later.
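A minimal sketch of what I mean by minibatch SGD with a decaying learning rate; `grad_fn` is a placeholder for whatever computes the gradients of the loss with respect to the parameters:

```python
import numpy as np

def sgd(params, grad_fn, X, y, lr0=0.1, decay=0.95, epochs=10, batch_size=128):
    """Minibatch SGD: step each parameter downhill along its gradient,
    reshuffling the data every epoch and decaying the learning rate."""
    lr = lr0
    for epoch in range(epochs):
        order = np.random.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = order[start:start + batch_size]
            grads = grad_fn(params, X[batch], y[batch])
            for name in params:
                params[name] -= lr * grads[name]
        lr *= decay  # smaller steps as training progresses
    return params
```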
(Subsection conclusion, not explicitly labelled as such)
[ ] Exercise: Introduce the CIFAR10 exercise as a natural extension to the TensorFlow tutorial. This gives a feeling for the ImageNet dataset and teaches how to feed a different kind of dataset to the classifier.
Moving to neural networks
[x] Writing: Find out the accuracy of a linear classifier on CIFAR10 and report the value.
[x] Image: Cartoon representation of the image space as a 2D plane, with the classifier being a bunch of lines. (Redo current placeholder from CS231n, incorporating more of its caption if needed.)
[ ] Link: Add a proper link to (fully-connected) neural networks subsection
[ ] Link: Add a proper link to CNN subsection
Making the score function nonlinear
[x] Writing: The whole subsection needs to be researched and written.
[x] Writing: Reiterate here that the way we wish to improve the performance of our classifier is to make it nonlinear.
[x] Writing: Introduce bias trick much earlier. Perhaps as soon as the first score function is introduced.
[ ] Image: Figure of the ReLU
[x] Writing: Introduce ReLU as a first nonlinear extension, serving as our first model of a neuron. There are many other functional forms one could use, but this one form is really popular today and will suffice for our needs.
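A quick sketch of the ReLU and the resulting model of a single neuron, just to fix the form I have in mind:

```python
import numpy as np

def relu(z):
    """Rectified linear unit: pass positive values through, clamp the rest to zero."""
    return np.maximum(0.0, z)

def neuron(x, w, b):
    """A single neuron: a linear score followed by the ReLU nonlinearity."""
    return relu(x @ w + b)
```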
Layer-wise organisation into a network
[x] Writing: The whole subsection needs to be researched and written.
[x] Writing: Talk about organising collections of neurons into (acyclic) graphs. This introduces the fully-connected (FC) layer. More layers allow for more nonlinearity, even though each neuron is barely nonlinear.
[x] Image: Some examples of neural networks.
[x] Writing: Note that this allows for the now classic architecture of matrix multiplications interwoven with nonlinear activation functions (see the sketch below).
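The pattern in code, as a minimal two-layer sketch (shapes and names are placeholders):

```python
import numpy as np

def two_layer_net(x, W1, b1, W2, b2):
    """The classic architecture: matrix multiplications interwoven with nonlinearities."""
    h = np.maximum(0.0, x @ W1 + b1)  # fully-connected layer followed by ReLU
    return h @ W2 + b2                # second fully-connected layer gives the class scores
```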
Some technicalities
[x] Writing: The whole subsection needs to be researched and written.
[ ] Writing: Explain how to initialise such networks.
[ ] Writing: Explain how to clean input data (subtracting average of channels).
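A minimal sketch of both technicalities, assuming images come as an (N, height, width, channels) array; the fan-in scaling is one common initialisation choice for ReLU networks:

```python
import numpy as np

def init_layer(n_in, n_out):
    """Small random weights scaled by fan-in, and zero biases."""
    return np.random.randn(n_in, n_out) * np.sqrt(2.0 / n_in), np.zeros(n_out)

def preprocess(images):
    """Centre the data by subtracting the per-channel mean of the training set."""
    mean = images.mean(axis=(0, 1, 2), keepdims=True)  # one mean per colour channel
    return images - mean
```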
(Subsection conclusion, not explicitly labelled as such)
[x] Writing: Offer some conclusions on NNs in general.
[x] Exercise: Set up a simple exercise in TensorFlow. The point is to try to improve upon the linear image classifier we had earlier (2-layer fully connected network + softmax on CIFAR10/MNIST).
[x] Writing: (Evaluate if this is needed.) You now know enough to extend the one step MNIST TensorFlow tutorial into multi-layer and try it out. Note that your accuracy on MNIST goes from roughly 92% to 97%.
And finally, convolutional neural networks
[x] Writing: Polish this introduction. Feels a bit drab.
[x] Writing: Add the actual number of parameters needed for the multi-layer fully-connected neural networks
Architecture of CNNs in general
[x] Writing: Polish this subsection.
[ ] Writing: Work in additional notes from last pages of the notebook.
[ ] Image: Figure of the differences between standard and convolutional neural networks.
Convolutional (Conv) layer
[x] Writing: Write this section introducing spatial intuition and parameters (depth, stride and zero-padding); a worked output-size example follows after this list.
[x] Writing: Parameter sharing greatly reduces the number of parameters we’re dealing with.
[x] Image: Work in animation GIF from 231n notes.
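For the spatial intuition mentioned above, a worked example of how filter size, stride and zero-padding determine the output size:

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1

# A 224x224 input with 3x3 filters, stride 1 and 1 pixel of zero-padding
# keeps its spatial size: (224 - 3 + 2*1) / 1 + 1 = 224.
assert conv_output_size(224, filter_size=3, stride=1, padding=1) == 224
```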
Pooling (Pool) layer
[x] Writing: Has no learnable parameters (only hyperparameters such as pool size and stride); it simply reduces the spatial size and hence the computational complexity of the problem at hand. Also helps reduce over-fitting.
[ ] Image: Depict a pool layer in a figure.
[x] Writing: Recall that with this notation, the models we've seen so far look like the following:
Linear: Input -> FC -> Loss
NN: Input -> FC -> ReLU -> FC -> Loss
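The same notation written out as a tiny Keras model, i.e. Input -> Conv -> ReLU -> Pool -> FC -> Loss (Keras 2 layer names; CIFAR10-sized inputs assumed):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Input -> Conv -> ReLU -> Pool -> FC -> Loss for 32x32 RGB images and 10 classes.
model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D(pool_size=(2, 2)),   # halves the spatial resolution
    Flatten(),
    Dense(10, activation='softmax'),  # class scores turned into probabilities
])
model.compile(optimizer='sgd', loss='categorical_crossentropy')
```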
A powerful CNN-based image classifier
[x] Writing: Needs more writing to better describe VGGNet.
[x] Image: The architecture of the VGGNet family.
[ ] Writing: Note that this model has a very large number of parameters, so it would take a long time to train from scratch, but the authors have shared their learnt weights, so we can transfer this knowledge over for our purposes (see the loading sketch below).
[x] Exercise: Replicate CNN tutorial from tensorflow.org. Modify to add many other layers to the network (get feeling for types). Get annoyed by boilerplate code. Redo the exercise in Keras to see how much simpler it is. This is what we're going to employ henceforth.
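Loading those shared weights is a one-liner in Keras; a sketch with VGG16 (the article may end up using VGG16 or VGG19, both ship with keras.applications):

```python
from keras.applications.vgg16 import VGG16

# Download the ImageNet-trained weights and drop the fully-connected "top":
# for style transfer we only need the convolutional feature maps.
model = VGG16(weights='imagenet', include_top=False)
model.summary()
```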
Returning to the style transfer problem
A neural algorithm of artistic style
[ ] Writing: This whole subsection needs to be written.
[ ] Writing: Summarise the Gatys et al. paper for the core ideas (and a sketch of the solution methodology). Needs to be done in a way that isn’t too redundant with the concrete implementation notes below.
[ ] Image: A figure showing off the algorithm.
Some technicalities
[ ] Writing: Introduce L-BFGS as a valid quasi-Newton approach to solve the optimisation problem.
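A toy example just to show the calling convention of SciPy's L-BFGS routine, which the implementation below relies on (the objective here is a stand-in):

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def loss_and_grad(x):
    """Stand-in objective ||x - 3||^2 together with its analytic gradient."""
    return np.sum((x - 3.0) ** 2), 2.0 * (x - 3.0)

# fmin_l_bfgs_b expects a function returning (loss, gradient) when fprime is None.
x_opt, final_loss, info = fmin_l_bfgs_b(loss_and_grad, np.zeros(5), maxfun=20)
```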
Concrete implementation of the artistic style transfer algorithm
[ ] Link: Fix link to Keras implementation.
[ ] Link: Fix link to IPython notebook in Stylist algorithm set.
[ ] Writing: Start with some initial comments on why the following are needed. Notice it doesn't require a lot of packages.
Load and preprocess the content and style images
[ ] Images: Insert image output from the IPython notebook (resized to 512 px × 512 px)
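A minimal loading/preprocessing sketch, assuming the Keras image utilities and the VGG preprocessing (the filenames are placeholders):

```python
import numpy as np
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.vgg16 import preprocess_input

def load_image(path, size=(512, 512)):
    """Load an image, resize it, add a batch axis and apply VGG-style
    preprocessing (RGB -> BGR plus subtraction of the ImageNet channel means)."""
    img = img_to_array(load_img(path, target_size=size))
    return preprocess_input(np.expand_dims(img, axis=0))

content_image = load_image('content.jpg')  # placeholder filenames
style_image = load_image('style.jpg')
```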
Reuse a model pre-trained for image classification to define loss functions
[ ] Reference: Link to Justin Johnson’s paper in the reference list. Make a Markdown link to it and use it in this section.
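One way to expose the pre-trained feature maps for the loss functions below is to index the layers by name (the block names shown are those used by keras.applications' VGG16); a sketch:

```python
from keras.applications.vgg16 import VGG16

model = VGG16(weights='imagenet', include_top=False)

# Map layer names to output tensors so the losses can pick feature maps by name,
# e.g. 'block4_conv2' for content and 'block1_conv1', 'block2_conv1', ... for style.
outputs_dict = {layer.name: layer.output for layer in model.layers}
```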
The content loss
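For reference while writing this: the content loss is a squared-error distance between the feature maps of the content image and of the generated image at one chosen layer; any weighting by alpha is applied when the losses are combined. A minimal sketch with the Keras backend:

```python
from keras import backend as K

def content_loss(content_features, combination_features):
    """Squared-error distance between the feature maps of the content image
    and the generated (combination) image at a single chosen layer."""
    return K.sum(K.square(combination_features - content_features))
```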
The style loss
[ ] Writing: Improve the clarity of the explanation of this section, making sure to point out that the Gram matrix is a special case of something more general.
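A sketch of the Gram matrix and the resulting style loss, assuming each feature map arrives as a single (height, width, channels) tensor; the normalisation follows the form in Gatys et al.:

```python
from keras import backend as K

def gram_matrix(features):
    """Channel-to-channel correlations of a feature map: flatten the spatial
    dimensions and take inner products between all pairs of channels."""
    flat = K.batch_flatten(K.permute_dimensions(features, (2, 0, 1)))  # channels first
    return K.dot(flat, K.transpose(flat))

def style_loss(style_features, combination_features, size, channels):
    """Squared distance between Gram matrices, normalised by feature map size."""
    S = gram_matrix(style_features)
    C = gram_matrix(combination_features)
    return K.sum(K.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))
```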
The total variation loss
[ ] Implementation: Explain this better, or rewrite it so it reads more clearly.
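A sketch of the total variation loss on the generated image, assuming a (1, height, width, channels) tensor; the 1.25 exponent is a commonly used choice, not something fundamental:

```python
from keras import backend as K

def total_variation_loss(x):
    """Penalise differences between neighbouring pixels of the generated image
    to keep the result locally coherent (less high-frequency noise)."""
    a = K.square(x[:, 1:, :-1, :] - x[:, :-1, :-1, :])  # vertical neighbours
    b = K.square(x[:, :-1, 1:, :] - x[:, :-1, :-1, :])  # horizontal neighbours
    return K.sum(K.pow(a + b, 1.25))
```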
Define needed gradients and solve the optimisation problem
[ ] Implementation: Make this whole business of introducing an Evaluator class clearer (a minimal sketch of the caching pattern follows below), or better yet, remove it entirely.
[ ] Writing: The following is not really the last iteration, but all iterations put into a GIF.
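A minimal sketch of what the Evaluator is for, so the note above makes sense: fmin_l_bfgs_b asks for the loss and the gradients through two separate callbacks, and the class caches both from a single pass through the network (`loss_and_grads_fn` is a placeholder):

```python
class Evaluator:
    """Cache loss and gradients computed in one pass so that fmin_l_bfgs_b,
    which requests them via separate callbacks, doesn't run the network twice."""

    def __init__(self, loss_and_grads_fn):
        self.loss_and_grads_fn = loss_and_grads_fn  # returns (loss value, flat gradients)
        self._grads = None

    def loss(self, x):
        loss_value, self._grads = self.loss_and_grads_fn(x)
        return loss_value

    def grads(self, x):
        return self._grads  # gradients computed alongside the most recent loss

# usage: fmin_l_bfgs_b(evaluator.loss, x0, fprime=evaluator.grads, maxfun=20)
```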
Conclusion
[ ] Writing: This whole subsection needs to be written using the skeleton in the article.
[ ] Image: Get some examples and document the corresponding hyperparameters.
General
[x] Commit images that are actually used in the article to the repo.
[x] Arrange the links in the bottom of the article in a more coherent fashion.
[x] Make sure the reference list contains all interesting articles.