Draw something in the canvas, and a neural network will guess what it is!
See demo
This is a NodeJS / React single page web app that presents the user with a canvas to draw on. The webapp loads the model in ONNX format and thereafter uses onnx.js to classify black-and-white 64x64 images uploaded by clients. Responses are displayed in the UI as a tag cloud and a meter indicating the activation of a single output neuron.
The Python training script train.py
is included. It uses Pytorch to train a convolutional neural network to
classify doodles. For training data it scans files from Google's The Quick, Draw! Dataset,
a data set of 50 million doodles separated into 342 classes. It generates two files: doodles.onnx
and labels.txt
.
Google uses a recurrent neural network that pays attention to the order and timing of your strokes as you draw. This one is much simpler and only looks at static images. Its size was constrained by the limited memory on the GPU. However, the model I left on S3 agrees with Google's 73% of the time, which is actually not bad for 342 classes. Still, with a lot of them, I have no clue what this thing is looking for.
You can use the Python script train.py
to create your own network, possibly with a different design.
This script will use Pytorch to train a network with two convolutional layers and three fully connected layers:
![CNN model] (https://cdn-media-1.freecodecamp.org/images/dobVrh3SGyqQraM2ogi-P3VK2K-LFsBm7RLO)
You will need to download about 20 GB of data from Google, specifically the "simplified drawing files. You can download them from Google Cloud Storage or from Kaggle which is where I got mine.
By default, train.py
will look for Google's data files within a data/quickdraw
directory in your home folder.
If you put them somewhere else, supply the path as a command line argument. It will write doodles.onnx
and labels.txt
to the project folder.
Installation requirements include numpy, PIL (Python Image Library), and Pytorch (all installable using pip install
).
If you're running Linux on a system with an RTX card, you can also install NVidia's CUDA Toolkit,
and follow their instructions for installing their Automatic Mixed Precision libary
which is a Pytorch extension for performing float-16 arithmetic on the GPU using NVidia's tensor cores. If you do this,
then initialize MIXED_PRECISION
to True
in train.py
. This will cut down on training time and loosen memory constraints.
Once trained, the network can discriminate among 344 different classes (including a"nothing" channel with all-zero samples). If you're not interested in discriminating among that many classes, remove their data files before starting the training script.
While the script is running, deleting doodles.onnx
(or doodles.pth
) will trigger a replacement with fresh versions.
This way you can see how the network is behaving at various phases of training by periodically deleting the existing
doodles.onnx
and restarting the JavaScript webapp with the new version that appears.
The network is deployed on a Node.js webapp, using Express on the server and React on the client.
In general, neural networks can be downloaded in ONNX format and run in the client's browser, but not this one-
doodles.onnx
is 300 MB and must stay on the server. Upon startup the server will load the files
doodles.onnx
and labels.txt
. If either is not present on disk, the server will download a copy
that I left in an S3 bucket. Once the server initializes the network, clients then communicate with it
using a REST API that accepts a 64x64 black and white image and responds with a list of guesses.
If you start the server without doodles.onnx
present, it will download a copy from a bucket I left on S3.
A much smaller file (labels.txt
) will also be downloaded from S3 if it is not present. These two
files are a matched set generated by the Python training script. Both are included in .gitignore
.
Clone the repository, then run npm install
, npm run build
, and npm start
. The app will listen on port 8000.
In the project directory, you can run:
npm run build
Builds the app for production to the build
folder.
It correctly bundles React in production mode and optimizes the build for the best performance.
The app is now ready to be deployed using npm start
.
npm run clean
Blows away the build
folder.
npm start
Starts the Express server listening on port 8000. (This basically means running node ./server.js
.)
If npm run build
has been executed, the build folder will be found, and its contents will be served as static files.
Otherwise the server will warn you that you need to run the webpack-dev-server
script.
npm run webpack-dev-server
.Runs the app in development mode (for React development). This needs to be run alongside npm start
and after npm run clean
.
The app will then be visible on port 3000. Pages will reload whenever you make edits.
npm run eject
This is a one-way, irreversible operation. Run this if you like dealing with Webpack files.
Contributions are welcome. If you have a nicer video card with more available memory, or a better idea for the network structure
(e.g. size and number of convolutional or hidden layers, etc.) you can edit train.py
and use it to generate your own
ONNX file that will perform better than the one I left on S3. If you do then please let me know.
The webapp can be adapted for doodling into any type of image classifier (e.g. MNIST), as long as the model is in ONNX format and accepts inputs as 1x64x64 images.