knkski opened this issue 7 years ago
https://cambridgespark.com/content/tutorials/neural-networks-tuning-techniques/index.html
This post mentions some of what we were talking about last time: using the he_normal kernel initializer with ReLU activation, data augmentation, etc. Their model, trained on MNIST (at the end of the post), achieved 99.47% accuracy on the test data. Maybe something we could try?
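For concreteness, here's a minimal Keras sketch of those two ideas (he_normal initialization paired with ReLU, plus light augmentation). This isn't their model or ours; the layer sizes and augmentation settings are just placeholders:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator

# Tiny example network: every weighted layer uses he_normal + ReLU
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal',
           input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu', kernel_initializer='he_normal'),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Light augmentation along the lines the post describes (exact ranges are a guess)
datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.1,
                             height_shift_range=0.1, zoom_range=0.1)
# model.fit_generator(datagen.flow(x_train, y_train, batch_size=64),
#                     steps_per_epoch=len(x_train) // 64, epochs=20)
```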
Found an architecture, SimpleNet: https://github.com/Coderx7/SimpleNet
Their benchmarks show that it performs pretty well, even better than many more complex architectures across different image recognition datasets (including MNIST), while using fewer parameters.
The corresponding paper, https://arxiv.org/pdf/1608.06037.pdf, introduces their design in detail and also includes some tips for fine-tuning CNNs. Good to read if you guys are interested.
Some interesting things stand out to me:
Since they only offer a Caffe version, I "translated" it into Keras: https://github.com/knkski/atai/blob/master/train_SimpleNet.py. However, I haven't tested it yet, so if any of you are able to run it (and debug it...), that would be great! Or we could just pick some pieces and transplant them into our model.
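From my reading of the paper, the repeated building block is roughly Conv -> BatchNorm -> ReLU, with max pooling and dropout interleaved every few blocks. A rough Keras sketch of that pattern is below; the filter counts are illustrative only, see train_SimpleNet.py for the actual layer sizes:

```python
from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, Activation, MaxPooling2D, Dropout

def simplenet_block(model, filters):
    # SimpleNet-style block: 3x3 conv, batch norm, then ReLU
    model.add(Conv2D(filters, (3, 3), padding='same', kernel_initializer='he_normal'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

model = Sequential()
model.add(Conv2D(64, (3, 3), padding='same', kernel_initializer='he_normal',
                 input_shape=(28, 28, 1)))
model.add(BatchNormalization())
model.add(Activation('relu'))
simplenet_block(model, 128)
simplenet_block(model, 128)
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
# ...more blocks as in the paper, ending in a softmax classifier
```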
Thank you, Yekun
I can answer the zero-padding question. Basically, each layer downsamples the image (particularly max pooling). Since we don't have very large input images, they can quickly get downsampled to a 0x0 image, which isn't useful. Zero padding helps prevent that.
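A quick illustration of the downsampling point (layer sizes are arbitrary, just to show the shapes): without padding, a 28x28 MNIST image shrinks at every conv layer; with 'same' (zero) padding, only the pooling layers reduce the spatial size.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D

# No zero padding: the spatial size shrinks at every conv layer
valid = Sequential([
    Conv2D(32, (5, 5), padding='valid', input_shape=(28, 28, 1)),  # -> 24x24
    MaxPooling2D((2, 2)),                                          # -> 12x12
    Conv2D(32, (5, 5), padding='valid'),                           # -> 8x8
    MaxPooling2D((2, 2)),                                          # -> 4x4
])

# Zero padding: only the pooling layers downsample
same = Sequential([
    Conv2D(32, (5, 5), padding='same', input_shape=(28, 28, 1)),   # -> 28x28
    MaxPooling2D((2, 2)),                                          # -> 14x14
    Conv2D(32, (5, 5), padding='same'),                            # -> 14x14
    MaxPooling2D((2, 2)),                                          # -> 7x7
])

valid.summary()
same.summary()
```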
Unfortunately, it looks like a naive implementation of SimpleNet doesn't perform as well as VGGNet:
It's not far off, though. I'll see if I can tweak the parameters and make it perform any better.
Right now we're using a variant of VGGNet, which is giving decent results. However, we should investigate alternatives such as AlexNet. We should also investigate how well an actual version of VGGNet works, although this is blocked by #1, due to GPU memory usage.
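For reference, here's a rough sketch of the kind of scaled-down VGG-style block I mean (the actual model in the repo may use different filter counts and depth). Full VGG-16 repeats blocks like this with 64/128/256/512 filters plus two 4096-unit dense layers, which is where the GPU memory problem from #1 comes in:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def vgg_block(model, filters, **kwargs):
    # VGG-style block: two 3x3 convs followed by 2x2 max pooling
    model.add(Conv2D(filters, (3, 3), activation='relu', padding='same', **kwargs))
    model.add(Conv2D(filters, (3, 3), activation='relu', padding='same'))
    model.add(MaxPooling2D((2, 2)))

model = Sequential()
vgg_block(model, 32, input_shape=(28, 28, 1))
vgg_block(model, 64)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Handy for comparing parameter counts against the SimpleNet translation
print(model.count_params())
```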