Optical character recognition (OCR) for handwritten text is useful wherever a business has to process large amounts of paper documents. One example is a traditional postal operator that wants to automatically process the information on envelopes. Envelopes are usually not neatly addressed, and standard OCR solutions fail in this area. A custom-made character recognition engine designed for this task might achieve much better quality.
The showcase presents two main things:
We use the bitfusion.io Scientific Computing AMI.
The AMI contains Ubuntu 14 along with an R installation and CUDA drivers. Additionally, we installed MXNet by running the following commands:
sudo apt-get update
sudo apt-get install -y build-essential git libblas-dev libopencv-dev
git clone --recursive https://github.com/dmlc/mxnet
Next, modify config.mk by setting the following keys:
USE_CUDA = 1
USE_CUDA_PATH = /usr/local/cuda
USE_BLAS = atlas
Finally, compile MXNet with the command make -j4
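The configuration and build steps above can also be scripted. The following is a minimal sketch, assuming it is run from the root of the cloned mxnet directory and that config.mk already exists there. The helper names set_config_key and configure_and_build are hypothetical and are not part of MXNet.

```shell
#!/bin/sh
# Hypothetical helper: set "KEY = VALUE" in a make-style config file.
# Replaces an existing "KEY = ..." line, or appends one if the key is missing.
set_config_key() {
    file=$1; key=$2; value=$3
    if grep -q "^${key}[[:space:]]*=" "$file"; then
        # Rewrite the existing assignment through a temp file (portable across sed versions).
        sed "s|^${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Apply the three settings from the text, then build.
# Assumption: config.mk is in the current directory.
configure_and_build() {
    set_config_key config.mk USE_CUDA 1 &&
    set_config_key config.mk USE_CUDA_PATH /usr/local/cuda &&
    set_config_key config.mk USE_BLAS atlas &&
    make -j4
}
```

Calling configure_and_build from the mxnet directory would apply the three keys and start the build. Editing config.mk by hand, as described above, gives the same result.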
01_install_packages.R
02_download_datasets.R
03_declare_mlp_model.R
04_prepare_data_iterators.R
05_fit_mlp_model.R
06_restart_mlp_model.R
07_predict_mlp_model.R
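One possible way to run the scripts listed above is sketched below. It rests on assumptions not stated in the text: that the scripts are meant to be executed in numeric order, that they share a single R session (so objects defined by an earlier script are visible to later ones), and that they sit in the current directory. The function names build_source_expr and run_pipeline are hypothetical.

```shell
#!/bin/sh
# Build an R expression that sources each given script in turn,
# e.g. source("a.R"); source("b.R");
build_source_expr() {
    # printf reuses the format string once per argument.
    printf 'source("%s"); ' "$@"
}

# Run all seven scripts in one R session.
# Assumption: Rscript is on the PATH and the scripts are in the current directory.
run_pipeline() {
    Rscript -e "$(build_source_expr \
        01_install_packages.R \
        02_download_datasets.R \
        03_declare_mlp_model.R \
        04_prepare_data_iterators.R \
        05_fit_mlp_model.R \
        06_restart_mlp_model.R \
        07_predict_mlp_model.R)"
}
```

Calling run_pipeline would stop at the first script that raises an R error, because Rscript exits with a non-zero status in that case. If the scripts turn out to be independent, they could equally be run one at a time.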
This example presents one possible use of deep learning models for image classification. One important problem is selecting an optimal structure for a deep neural network, which requires running several experiments to measure the predictive capabilities of various network topologies. Amazon Web Services meets this need by offering very large GPU instances. The flagship offering is the p2.16xlarge, with 16 NVIDIA Tesla K80 GPUs and a total of about 40,000 GPU cores. This machine, available from around $2.10 per hour on the AWS spot market, would have made the list of top supercomputers just 10 years ago.