happynear opened this issue 8 years ago
I know that in the first layer, there is some relevance in the color space. However, I think the relevance in the spatial space is more severe. Neural network models are really not good at controlling this relevance. So I guess training bigger filters, such as 3x3 or 5x5, may work better.
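To make the comparison concrete, here is a minimal numpy sketch (my own, not from the repo) that measures the two kinds of relevance I mean: between color channels and between neighboring pixels. The random array is only a stand-in for a real image; on a natural photo both numbers are typically high, and my point above is that the spatial one is the harder to remove.

```python
import numpy as np

img = np.random.rand(64, 64, 3).astype(np.float32)  # stand-in for a real image

# Relevance between color channels: flatten pixels into (N, 3) vectors.
pixels = img.reshape(-1, 3)
color_corr = np.corrcoef(pixels, rowvar=False)       # 3x3 correlation matrix

# Spatial relevance: each pixel vs. its right-hand neighbor (one channel).
left = img[:, :-1, 0].ravel()
right = img[:, 1:, 0].ravel()
spatial_corr = np.corrcoef(left, right)[0, 1]        # a single scalar

print(color_corr)
print(spatial_corr)
```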
Hi Feng! Thank you for your interest in this work. I am glad that you gave it a spin.
As you pointed out, this proposed layer is more of an idea that seems like a good start for further development. It needs further polishing and review. You might want to compare it against other orthogonal transformation procedures (e.g. PCA, PCA whitening, and ZCA whitening). Also, the layer doesn't preserve orthogonality.
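For reference, here is a minimal numpy sketch of those two whitening baselines, computed over per-pixel RGB vectors; `pixels` here is a hypothetical (N, 3) float array of training-set pixels, not something from the repo.

```python
import numpy as np

def whitening_matrices(pixels, eps=1e-5):
    x = pixels - pixels.mean(axis=0)              # center the data
    cov = x.T @ x / x.shape[0]                    # 3x3 channel covariance
    evals, evecs = np.linalg.eigh(cov)            # eigendecomposition
    d = np.diag(1.0 / np.sqrt(evals + eps))
    w_pca = d @ evecs.T                           # PCA whitening: rotate, then scale
    w_zca = evecs @ d @ evecs.T                   # ZCA whitening: rotate back afterwards
    return w_pca, w_zca

pixels = np.random.rand(10000, 3)                 # hypothetical pixel sample
w_pca, w_zca = whitening_matrices(pixels)
white = (pixels - pixels.mean(axis=0)) @ w_zca.T  # decorrelated pixels
```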
In regard to the comparison, yes, you were right. I was planning to remove the conv net layers from the color space transformation sampler. I updated the code. Thanks for pointing it out! The accuracy is still in the upper 60s.
Lastly, can you elaborate on your last comment? What do you mean by relevance?
I am sorry, English is not my first language, so there may be some errors in my expression.
By relevance (maybe "correlation" is the better word? I got "relevance" from translation software) I mean the correlation between different dimensions. Algorithms such as PCA can eliminate this correlation. The spatial correlation problem is more severe than the one in the color space, but there is no good solution for eliminating the spatial correlation in an image.
Moreover, your algorithm can be extended to any layer's parameters. In a traditional CNN, the parameters are independent of the input. I think we could try making all of the parameters outputs of the transformation network.
There are lots of details. Note that the early layers receive gradients that have passed through more layers from the loss (deeper gradients), while the later layers receive shallower ones. I think the best way is to make all parameters have a similar depth with respect to the loss layer. I am not good at English, so I will draw a picture to explain it.
This picture is taken from TensorFlow. The thin red lines denote the connections from the input to the parameters. I think we can add these connections and see what happens; a rough sketch of the idea is below.
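Something like this toy numpy sketch (the names and sizes are mine, purely hypothetical): a small side network reads the input and emits the main layer's weights, so the parameters are no longer independent of the input.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, hid_dim, out_dim = 32, 16, 10

# Only the side (transformation) network has trained parameters.
A = rng.normal(0, 0.1, (hid_dim, in_dim))
B = rng.normal(0, 0.1, (out_dim * in_dim, hid_dim))

def forward(x):
    h = np.tanh(A @ x)                    # side network reads the input...
    W = (B @ h).reshape(out_dim, in_dim)  # ...and emits the main layer's weights
    return W @ x                          # main layer applies input-dependent W

y = forward(rng.normal(size=in_dim))
```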
Hi Alexandros,
This work is really interesting and thought-provoking. When I first saw the title on arXiv, I wondered if it was just a 1x1 conv layer. Now I understand that you train 9 sample-specific color transform parameters and put them into the 1x1 conv layer.
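If I understand it correctly, the mechanism is something like this numpy sketch (the function names are mine, and the sampler here is a trivial placeholder for the trained network): 9 predicted numbers are reshaped into a 3x3 color matrix, and applying it per pixel is exactly a 1x1 convolution.

```python
import numpy as np

def sampler(img):
    # Placeholder for the trained sampler net: 9 numbers per image,
    # here just an identity-like transform.
    return np.eye(3).ravel() + 0.01

def color_transform(img):
    T = sampler(img).reshape(3, 3)  # 9 sample-specific parameters
    h, w, _ = img.shape
    # A 1x1 conv over the color channels == a matrix multiply per pixel.
    return (img.reshape(-1, 3) @ T.T).reshape(h, w, 3)

out = color_transform(np.random.rand(32, 32, 3))
```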
We have neural networks that can build great non-linear systems, yet each layer's parameters are applied as a fixed linear map, independent of the input. Now I would like to reconsider the structure of all NN models.
Thanks for sharing your work!