Link to application (currently not functional, as Heroku discontinued its free product plans)
The application will be able to predict and solve handwritten mathematical equations from a given image. The system should be capable of solving expressions involving arithmetic operations (addition, subtraction, multiplication, division) and equations of any degree (linear, quadratic, cubic and so on).
Key AI concepts used include OCR (Optical Character Recognition) and CNN (Convolutional Neural Networks). OCR is used to preprocess the image and segment characters, while CNN is used to predict the characters.
Download the repo (it can either be cloned or downloaded as a zip) and move to the project folder.
Run npm install inside the project directory to install all dependencies.
Both the ReactJS frontend and the FastAPI server have to be running:
Start ReactJS using:
cd frontend
npm run start
Start FastAPI using:
cd api
uvicorn app:app --port 8000 --reload
Alternatively, build and run both the frontend and the API with Docker Compose:
docker-compose up --build
The frontend can be viewed at http://localhost:3000 and the API can be viewed at http://localhost:8000
The Frontend part has been developed using ReactJS. Here the user enters the image either by uploading or by using the sketchpad. The image is encoded to base64 format and sent to the REST-API as a POST request.
The REST-API has been implemented using FastAPI. The request data is decoded and saved locally as an image, and this image is sent to the backend where the equation is predicted and solved. The image processing is done using TensorFlow and OpenCV. This process can be viewed as two separate modules: Equation Prediction and Equation Solver.
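The base64 round trip described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the JSON field name `image`, the function names, and the handling of a `data:` URL prefix (which a browser sketchpad may or may not send) are all assumptions.

```python
import base64
import json
from pathlib import Path


def build_request_body(image_bytes: bytes) -> str:
    """Encode raw image bytes as base64 and wrap them in a JSON body.

    The "image" field name is an assumption made for this sketch.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})


def save_request_image(body: str, out_path: Path) -> Path:
    """Decode the base64 payload and write it to disk as an image file."""
    encoded = json.loads(body)["image"]
    # A browser canvas may produce a data URL such as
    # "data:image/png;base64,...."; drop that prefix if it is present.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    out_path.write_bytes(base64.b64decode(encoded))
    return out_path
```

In a setup like this, the first function mirrors what the frontend does before the POST request, and the second mirrors what the API does before handing the saved file to the backend.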
OpenCV is used to perform binarization as well as line and character segmentation. A TensorFlow model trained on the EMNIST (Extended MNIST) dataset predicts each of the segmented characters, and the resulting equation is passed as a string to the Equation Solver.
The Equation Solver solves the mathematical equation and passes it back to the Frontend where it can be viewed.
The major steps include: Noise Removal, Binarization, Thresholding and Image Segmentation.
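To make the thresholding and segmentation steps concrete, here is a dependency-free sketch. It is not the project's OpenCV pipeline: it swaps in a fixed threshold and a simple vertical projection (splitting wherever a column contains no ink), skips noise removal, and uses hypothetical function names and a threshold of 128 chosen only for illustration.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def binarize(gray: Gray, threshold: int = 128) -> List[List[int]]:
    """Map dark ink pixels to 1 and light background pixels to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink.

    Each run of adjacent columns with at least one ink pixel is treated
    as one character candidate.
    """
    if not binary:
        return []
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a new character begins
        elif not ink and start is not None:
            spans.append((start, x))  # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width))  # character touches the right edge
    return spans
```

Each returned span could then be cropped out and resized before being passed to the classifier. Touching or overlapping characters would need a more robust method, such as contour detection.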
The binarized image and the segmented images can be viewed below:
After each character in the image is detected, the string containing the equation is passed to this final module, which solves the equation or mathematical expression.
The equation can be of two types:
A mathematical string such as ‘5+3’ or ‘66x3+2’ (the string that is input to this module is of this format). This string can be evaluated using either a custom-built function or the eval() function in Python.
A mathematical equation of any degree. The string ‘X2+5=0’ is interpreted as X**2 + 5 = 0, since the 2 appears after the variable, whereas ‘2X+5=0’ is interpreted as 2*X + 5 = 0. Since mispredicting even a single character leads to an incorrect result or a failure, simple replacements are performed on the given string to increase accuracy. These include Z -> 2, G -> 6, B -> 8 and D -> 0. The equation is solved using the SymPy library, which is a Python library for symbolic computation.
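The rewriting rules for the second type can be sketched as follows. This is one possible reading of the rules, not the project's own code: the function names are hypothetical, the variable is assumed to be an uppercase `X`, and SymPy is imported lazily so that the string normalization can be tried without it.

```python
import re

# Letter-to-digit corrections for commonly confused characters.
OCR_FIXES = str.maketrans({"Z": "2", "G": "6", "B": "8", "D": "0"})


def normalize_equation(raw: str, var: str = "X") -> str:
    """Turn an OCR string such as 'X2+5=0' into 'X**2+5=0'."""
    text = raw.replace(" ", "").translate(OCR_FIXES)
    v = re.escape(var)
    # A digit directly before the variable is a coefficient: 2X -> 2*X
    text = re.sub(rf"(\d){v}", rf"\1*{var}", text)
    # Digits directly after the variable are an exponent: X2 -> X**2
    text = re.sub(rf"{v}(\d+)", rf"{var}**\1", text)
    return text


def solve_equation(raw: str, var: str = "X"):
    """Solve 'lhs = rhs' for the variable with SymPy."""
    import sympy  # imported here so normalize_equation works without SymPy

    lhs, rhs = normalize_equation(raw, var).split("=", 1)
    expr = sympy.sympify(lhs) - sympy.sympify(rhs)
    return sympy.solve(expr, sympy.Symbol(var))
```

For example, `normalize_equation("3X2+ZX=D")` yields `3*X**2+2*X=0`, which `solve_equation` splits at the `=` sign and hands to `sympy.solve`.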
The two types are distinguished by checking whether the string contains ‘=’. If it does, it is interpreted as the second type; otherwise it is interpreted as the first type.
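The ‘=’ check and the first type of input can be sketched as below. Instead of eval(), this example walks the parsed syntax tree and allows only numbers and the four arithmetic operators, which is one way a custom-built evaluator could look. The function names are hypothetical, and treating a lowercase `x` as multiplication is an assumption based on the ‘66x3+2’ example.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def classify(equation: str) -> str:
    """Return 'equation' if the string contains '=', else 'expression'."""
    return "equation" if "=" in equation else "expression"


def evaluate_arithmetic(expr: str) -> float:
    """Evaluate +, -, *, / on numbers without calling eval()."""
    # Assumption: a lowercase 'x' between numbers means multiplication.
    tree = ast.parse(expr.replace("x", "*"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Anything else (names, calls, attributes) is rejected.
        raise ValueError(f"unsupported expression: {expr!r}")

    return walk(tree.body)
```

With these definitions, `classify("5+3")` returns `"expression"` and `evaluate_arithmetic("66x3+2")` returns `200`, while a string containing a function call raises `ValueError` rather than being executed.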
Contributions and pull requests are most welcome! Your input is invaluable, and we appreciate every contribution, whether it is a major change or a minor enhancement.