A full-stack React/JavaScript and Python/Django web application that recognizes handwriting and converts it into text by incorporating multiple machine learning models pre-trained on the EMNIST dataset from Kaggle. These neural network models recognize all digits, all uppercase letters, and all lowercase letters that are visibly different from their uppercase counterparts.
The models were trained on the following characters: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabdefghnqrt
To account for the "left out" lowercase letters that look like their uppercase counterparts, the final prediction for such a character is converted to lowercase if the character is drawn less than half the height of the canvas. The "tall" versions of these lowercase characters, klpy, are converted to lowercase if their height is less than 70% of the canvas height.
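The height rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the repo: the function name `apply_case` and its parameters are hypothetical, and the set of look-alike letters is simply derived as every lowercase letter missing from the trained character list.

```python
# Lowercase letters the models were trained on (from the character list above).
TRAINED_LOWERCASE = set("abdefghnqrt")
# Assumption: every other lowercase letter counts as an uppercase look-alike.
LOOKALIKE = set("abcdefghijklmnopqrstuvwxyz") - TRAINED_LOWERCASE
# "Tall" look-alike letters named in the text.
TALL_LOOKALIKE = set("klpy")


def apply_case(predicted: str, char_height: float, canvas_height: float) -> str:
    """Lowercase an uppercase prediction when it was drawn small enough."""
    lower = predicted.lower()
    # Digits, lowercase predictions, and letters with a trained lowercase
    # class are returned unchanged.
    if not predicted.isupper() or lower not in LOOKALIKE:
        return predicted
    # 70% of the canvas for tall letters, 50% for the rest.
    threshold = 0.70 if lower in TALL_LOOKALIKE else 0.50
    if char_height < threshold * canvas_height:
        return lower
    return predicted
```

For example, on a 400-pixel canvas a "Y" drawn 200 pixels tall would come back as "y", while an "H" of any height stays "H" because the models already distinguish "h" from "H".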
The best single model used in this application is more accurate than the other models created by Kaggle users who use TensorFlow/Keras. Beyond that, when this model, a similar model, and 3 other sub-optimal models (sub-optimal due to Heroku limitations) are combined, accuracy increases by another 0.5%.
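The text does not say how the five models are combined. One common approach, assumed here purely for illustration, is to average each model's per-class probabilities and take the most likely class. The function name `ensemble_predict` is hypothetical, and the plain lists stand in for whatever the Keras models actually return.

```python
from typing import Sequence


def ensemble_predict(model_outputs: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities across models and return the top class index."""
    if not model_outputs:
        raise ValueError("need at least one model output")
    n_classes = len(model_outputs[0])
    if any(len(probs) != n_classes for probs in model_outputs):
        raise ValueError("all models must output the same number of classes")
    # Mean probability of each class over all models.
    averaged = [
        sum(probs[i] for probs in model_outputs) / len(model_outputs)
        for i in range(n_classes)
    ]
    # Index of the class with the highest averaged probability.
    return max(range(n_classes), key=averaged.__getitem__)
```

With averaging, a single model that favors the wrong class can be outvoted by the others, which is one plausible reason a combination of weaker models can still add accuracy.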
The Jupyter Notebook inside this repo describes how the neural network models were created for this web application. It goes step by step: from acquiring the outside dataset for learning to Heroku deployment.
0-9, a-z, A-Z (62 characters)
Website: Live Heroku App
… POST request to Django.
… cv2.
… space_location.
… char_img_heights.
… final_prediction.
… 0 through 46, which corresponds to the index of the 47 characters that each model was trained on. (Ex: an output of 17 corresponds to H in the mapping.)
… char_img_heights. This decision will be performed on the images "y", "y", "o" and "u". The letter "y" gets a special constraint because its height is larger than the average lowercase letter.
… space_location, a " " is appended to the final result. In this example, space_location will have [2], signaling that there's a space after "y", which will give us "Hey " at the end of the first "y" iteration.
… final_prediction to React with "Hey you", and React displays the result on the client.

After a prediction has been decided by the neural network, I personally try to be as hands-off as possible when it comes to manipulating these results.
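To make the "Hey you" walkthrough concrete, here is a rough sketch of how model output indices and space positions could be turned into a string. The mapping string is the trained character list from the top of this document; the function name `assemble` is hypothetical, and the casing step is deliberately left out, so the look-alike letters stay uppercase here.

```python
# The 47 trained characters; a model output of 17 maps to "H".
MAPPING = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabdefghnqrt"


def assemble(indices, space_location):
    """Map each predicted index to its character and append spaces where recorded."""
    final_prediction = ""
    for position, index in enumerate(indices):
        final_prediction += MAPPING[index]
        # A position listed in space_location is followed by a space.
        if position in space_location:
            final_prediction += " "
    return final_prediction


# Indices for H, e, Y, Y, O, U with a space after the character at position 2.
print(assemble([17, 39, 34, 34, 24, 30], [2]))  # prints "HeY YOU"
```

Applying the height-based casing to "Y", "Y", "O" and "U" before appending them is what would turn this raw output into "Hey you".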
The current prediction manipulations I use are:
If the prediction is 0 and the character is drawn quite small, the prediction is changed to a lowercase o. Such a 0 will be read as an o, much like the manipulation of uppercase O.
I left in commented code where, if either 0 or O is predicted, the final prediction depends on the height/width ratio of the character image. If a user writes a fat circle, the result will be a capital or lowercase O; if a user writes a narrow circle, the result will be the number 0.
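A sketch of what such a ratio check might look like follows. It is not the commented-out code from the repo: the function name `resolve_zero_or_o` is hypothetical, and the cutoff of 1.3 is an arbitrary placeholder that would need tuning against real drawings.

```python
def resolve_zero_or_o(predicted: str, height: float, width: float,
                      narrow_ratio: float = 1.3) -> str:
    """Pick between "0" and "O" from the aspect ratio of the character image."""
    # Only the two ambiguous classes are touched.
    if predicted not in ("0", "O"):
        return predicted
    if width <= 0:
        raise ValueError("width must be positive")
    # Narrow (tall relative to width) shapes are treated as the digit,
    # rounder shapes as the letter. narrow_ratio is an assumed threshold.
    return "0" if height / width >= narrow_ratio else "O"
```

The letter returned here is uppercase; whether it ends up as "O" or "o" would still be decided by the height-based casing rule.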
For determining "i" vs "I" (another issue with the EMNIST dataset), one could write some code during the cv portion to determine whether a character has a hovering dot. One could then make a better height estimate for casing by taking the total character height and subtracting the space between the dot and the base of the "i".
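The idea could be sketched as below, working on bounding boxes rather than pixels. Everything here is hypothetical: the `Box` type, the function name `effective_height`, and the assumption that a character's separate strokes are already available as (x, y, w, h) boxes with y growing downward, for example from something like `cv2.boundingRect` applied to each contour.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    x: int
    y: int  # top edge, with y increasing downward as in image coordinates
    w: int
    h: int


def effective_height(components: Sequence[Box]) -> int:
    """Total character height, minus the dot-to-stem gap when a hovering dot is found."""
    top = min(box.y for box in components)
    bottom = max(box.y + box.h for box in components)
    total = bottom - top
    # Only a two-part character can be a dotted "i" in this simple sketch.
    if len(components) != 2:
        return total
    upper, lower = sorted(components, key=lambda box: box.y)
    gap = lower.y - (upper.y + upper.h)
    # A hovering dot: a shorter piece sitting above the stem with space between them.
    if gap > 0 and upper.h < lower.h:
        return total - gap
    return total
```

For a 6-pixel dot floating 10 pixels above a 40-pixel stem, the raw height is 56 but the adjusted height is 46, which gives the casing rule a fairer measurement of the drawn letter.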
git clone https://github.com/MikeM711/Deep-Learning-Handwriting-Recognition.git
cd Deep-Learning-Handwriting-Recognition
npm install
sudo -H pip install pipenv
pipenv shell
pip install -r requirements.txt
npm start
python manage.py runserver
Troubleshooting
(Deep-Learning-Handwriting-Recognition)...