
# hello-dl

bert@hubertnet.nl - I post about this project on Mastodon as @bert_hubert@fosstodon.org. A series of associated blog posts is here.

A from-scratch introduction to modern machine learning. Many tutorials exist already of course, but this one aims to really explain what is going on. Also, we'll develop the demo until it is actually useful on real-life data.

Other documents start out from the (very impressive) PyTorch environment, or they attempt to math it up from first principles.



Trying to understand deep learning via PyTorch is like trying to learn aerodynamics from flying an Airbus A380.

Meanwhile the pure maths approach ("see it is easy, it is just a Jacobian matrix") is probably only suited for seasoned mathematicians.

The goal of this tutorial is to develop modern neural networks entirely from scratch, while still ending up with really impressive results.

To do so, this project contains some minimalist tooling. The software may be minimal, but will showcase modern deep learning techniques that should wow you into believing that something very special is going on.

In quantum computing there is the concept of 'quantum supremacy', which is achieved when a quantum computer does something that no ordinary computer could do in the lifetime of the universe.

With this 'hello deep learning' software I'm aiming for a slightly more modest goal: show you a comprehensible amount of neural networking source code that solves a problem you'd never be able to address in fewer lines of code otherwise.

The intended audience for this project is people who are comfortable with programming and want to (re)join the artificial intelligence scene from the ground up.

This project was inspired by Georgi Gerganov's AWESOME C++ implementation of OpenAI's Whisper speech/translation model.

# Mission statement

Non-goals:

The idea is that after you are done with the blog posts (yet to be written) and have worked with the tools, it should all make sense to you. Then you can move on to professional tooling and get to work.

Although this is an introduction using minimal dependencies and lines of code, we'll be covering modern subjects like momentum, data augmentation, weight decay, batch normalization, dropout, GELU, weights and bias logging, learning rate scheduling & more.
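To give a flavour of what some of these ingredients look like in code, here is a minimal hypothetical sketch of a stochastic gradient descent update with momentum and weight decay, applied to a single scalar parameter. This is not this repository's implementation, and the `MomentumUpdater` name is made up for illustration:

```cpp
// Hypothetical illustration of SGD with momentum and weight decay,
// operating on one scalar parameter. Not code from this repository.
#include <iostream>

struct MomentumUpdater
{
  double lr = 0.01;        // learning rate
  double mu = 0.9;         // momentum factor
  double decay = 0.0001;   // weight decay (L2 regularization strength)
  double velocity = 0.0;   // decayed running sum of gradients

  void step(double& weight, double gradient)
  {
    gradient += decay * weight;            // weight decay nudges parameters towards zero
    velocity = mu * velocity + gradient;   // momentum smooths gradients over successive batches
    weight -= lr * velocity;
  }
};

int main()
{
  double w = 1.0;
  MomentumUpdater upd;
  for(int i = 0; i < 5; ++i) {
    double grad = 2 * w;                   // pretend the loss is w^2, so dloss/dw = 2w
    upd.step(w, grad);
    std::cout << "step " << i << ": w = " << w << "\n";
  }
}
```

Momentum and weight decay really are this small: the same few lines, applied per parameter across all the weights of a network.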

And unlike most demos, this one includes crowd-sourced validation data that helps show why many deep learning projects are great in theory but disappoint in practice.

It appears machine learning projects tend to go through four phases:

  1. It works on the training data
  2. It also works on the validation data
  3. After a lot of disappointment, we get it to work on other people's real life data too
  4. Other people can get it to work on their own data as well

Almost all demos declare victory after phase 2. This tutorial aims to achieve the final phase.

Finally, we'll also spend time on how to detect that a neural network is not delivering a confident result.
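A simple way to do that (a generic sketch of the idea, not this project's code) is to turn the network's raw output scores into softmax probabilities and treat a low winning probability as "I don't know":

```cpp
// Hypothetical sketch: flag predictions whose best softmax probability is low.
// Not code from this repository.
#include <vector>
#include <cmath>
#include <algorithm>
#include <iostream>

// Convert raw network outputs ("logits") into probabilities that sum to 1
std::vector<double> softmax(const std::vector<double>& logits)
{
  double m = *std::max_element(logits.begin(), logits.end());
  double sum = 0;
  std::vector<double> probs(logits.size());
  for(size_t i = 0; i < logits.size(); ++i) {
    probs[i] = std::exp(logits[i] - m);   // subtract max for numerical stability
    sum += probs[i];
  }
  for(auto& p : probs)
    p /= sum;
  return probs;
}

int main()
{
  std::vector<double> logits = {0.1, 0.3, 0.2, 2.5, 0.0, 0.1, 0.2, 0.1, 0.3, 0.2};
  auto probs = softmax(logits);
  auto best = std::max_element(probs.begin(), probs.end());
  size_t digit = best - probs.begin();

  if(*best < 0.5)
    std::cout << "not sure, best guess " << digit << " at p = " << *best << "\n";
  else
    std::cout << "confident: " << digit << " at p = " << *best << "\n";
}
```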

# Acknowledgments

Despite doing some neural network experiments in the 1990s, I'm extremely new to the modern stuff. The following projects/sites have been very useful in catching up:

# Status

Pretty rough, but showing signs of doing something real. The code implements two small but pretty nice autograd systems, one simple and slow, the other slightly more complicated but fast.

In `tensor-relu.cc` you can find a computer program of ~150 lines in total that learns to recognize handwritten digits in a few seconds (90% accuracy so far).

We do have a few dependencies:

```bash
sudo apt install cmake libeigen3-dev libsqlite3-dev g++ libz-dev
```

This works for Debian-derived systems, but cmake, Eigen and SQLite are universally available.

Here is how to install and run the first sample:

```bash
git clone https://github.com/berthubert/hello-dl.git
cd hello-dl
cmake .
make -j4
wget http://www.itl.nist.gov/iaui/vip/cs_links/EMNIST/gzip.zip
unzip gzip.zip
./tensor-relu
```

The result is nice:

```
Start!
Have 240000 images
...
Percent batch correct: 32.8125%
Average loss: 2.1219.
Predicted: 8, actual: 8: We got it right! Loss: 1.89751, -2.44847 -2.54296 -2.43445 -1.92324 -3.07031 -2.50595 -2.27749 -2.1481 -1.89751 -2.26366
```

This is followed by an ASCII-art rendering of the handwritten digit that was just recognized.

This software is meant to accompany a series of blog posts introducing deep learning from the ground up. That series hasn't started yet as I am still figuring out how this stuff works.

A more modern convolutional network can be found in [tensor-convo.cc](./tensor-convo.cc), and after hours and hours of crunching it will produce a network that you can feed to [img-ocr.cc](./img-ocr.cc). If the stars align, handwritten 'print' letters in an image will be recognized by the network.

If you want to see something cool already, take a look at [testrunner.cc](./testrunner.cc), which already shows some of the autogradient stuff.

# Getting started

Check out the repository, run `cmake .` and then `make`. To actually do something, [download the EMNIST dataset](http://www.itl.nist.gov/iaui/vip/cs_links/EMNIST/gzip.zip) and unzip it. There's no need to gunzip the .gz files.

Next up, run `./tensor` or `./first-convo` and wonder what you are seeing.

# Data

https://www.nist.gov/itl/products-and-services/emnist-dataset

https://arxiv.org/pdf/1702.05373v1

http://yann.lecun.com/exdb/mnist/

# Inspiration & Similar projects

https://github.com/fastai/fastbook/blob/master/04_mnist_basics.ipynb

Also really good: https://pytorch.org/tutorials/beginner/nn_tutorial.html

[PicoGPT](https://jaykmody.com/blog/gpt-from-scratch/)

[NanoGPT](https://github.com/karpathy/nanoGPT/)

# Internals

~~Use `TrackedNumber` to build networks. You can also run them there if you want, but a faster way is to export them as 'Work' objects, which are a solid slab of contiguous memory.~~
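To give a feel for what such a scalar autograd system does, here is a tiny, hypothetical reverse-mode automatic differentiation sketch. It only illustrates the general idea of tracking a computation and pushing gradients back through it; the names (`Node`, `backward`) are made up for this example and are not the repository's actual API:

```cpp
// Hypothetical, minimal scalar reverse-mode autograd. Illustrates the general
// idea behind the project's autograd systems; not the repository's code.
#include <memory>
#include <utility>
#include <vector>
#include <iostream>

struct Node
{
  double value = 0;
  double grad = 0;
  // each parent is stored together with the local derivative of this node w.r.t. it
  std::vector<std::pair<std::shared_ptr<Node>, double>> parents;
};

using NodeP = std::shared_ptr<Node>;

NodeP num(double v)
{
  auto n = std::make_shared<Node>();
  n->value = v;
  return n;
}

NodeP operator+(NodeP a, NodeP b)
{
  auto n = num(a->value + b->value);
  n->parents = {{a, 1.0}, {b, 1.0}};           // d(a+b)/da = 1, d(a+b)/db = 1
  return n;
}

NodeP operator*(NodeP a, NodeP b)
{
  auto n = num(a->value * b->value);
  n->parents = {{a, b->value}, {b, a->value}}; // product rule
  return n;
}

void backward(NodeP n, double seed = 1.0)
{
  n->grad += seed;
  for(auto& [parent, local] : n->parents)
    backward(parent, seed * local);            // chain rule, applied recursively
}

int main()
{
  auto x = num(3.0), y = num(4.0);
  auto z = x * y + x;                          // z = x*y + x
  backward(z);
  std::cout << "z = " << z->value << ", dz/dx = " << x->grad
            << ", dz/dy = " << y->grad << "\n";
}
```

Running it prints `z = 15, dz/dx = 5, dz/dy = 3`, matching the hand-computed derivatives of `z = x*y + x` at `x = 3, y = 4`.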