karpathy / convnetjs

Deep Learning in Javascript. Train Convolutional Neural Networks (or ordinary ones) in your browser.
MIT License

A 4x faster alternative to ConvNetJS #115

photopea opened this issue 4 years ago · Status: Open

photopea commented 4 years ago

While extending my knowledge of neural networks, I implemented a neural network library in JavaScript. It has capabilities similar to ConvNetJS, but both training and testing are 4x faster (while still running in a single JS thread on the CPU).

I did not have time to prepare demos as nice as those for ConvNetJS. I guess you can use ConvNetJS for learning and experimenting, and use my library when you want to train a specific network.

Also, my library can load pre-trained models from ConvNetJS (JSON) and Caffe (.caffemodel).

https://github.com/photopea/UNN.js - it is only 18 kB.
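
For reference, the ConvNetJS side of that JSON format is just the library's own serialization (a minimal sketch using the documented ConvNetJS API; net is an already-trained network):

// export a trained ConvNetJS network to the JSON that UNN.js can also read
var json = net.toJSON();
var str = JSON.stringify(json);

// and the reverse direction, loading the JSON back into ConvNetJS
var net2 = new convnetjs.Net();
net2.fromJSON(JSON.parse(str));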

NoamGaash commented 4 years ago

Does it have any advantage compared to TensorFlow.js?

photopea commented 4 years ago

@NoamGaash I guess TensorFlow.js uses WebGL, as I see the GPU being used during training.

On the UNN.js page, I describe creating a specific network for MNIST and training it in a certain way. I just tried to reproduce it with TensorFlow.js:

const model = tf.sequential();
// input: 24x24 grayscale images; conv -> pool -> conv -> pool -> softmax classifier
model.add(tf.layers.conv2d      ({inputShape: [24,24, 1], kernelSize: 5, filters: 8, activation: 'relu', padding: 'same'}));
model.add(tf.layers.maxPooling2d({poolSize: 2, strides: 2}));   // -> [12,12, 8]
model.add(tf.layers.conv2d      ({kernelSize: 5, filters: 16, activation: 'relu', padding: 'same'}));  // -> [12,12,16]
model.add(tf.layers.maxPooling2d({poolSize: 3, strides: 3}));   // -> [ 4, 4,16]
model.add(tf.layers.flatten());
model.add(tf.layers.dense({units: 10, activation: 'softmax'}));
model.compile({optimizer: 'adadelta', loss: 'meanSquaredError'});
// ... (load the MNIST data into xs and labels) ...
model.fit(xs, labels, {batchSize: 20, epochs: 1, validationSplit: 0});
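
(The elided line prepares the data; from the model definition above, the tensors would have these shapes. Dummy data for illustration, not the actual MNIST loader:)

// dummy tensors with the shapes the model expects (illustration only)
const xs = tf.randomNormal([100, 24, 24, 1]);                           // N 24x24 grayscale images
const labels = tf.oneHot(tf.randomUniform([100], 0, 10, 'int32'), 10)
                 .toFloat();                                            // N one-hot class targets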

One such training pass takes 60 seconds in UNN.js, 240 seconds in ConvNetJS, and 52 seconds in TensorFlow.js.

However, while UNN.js and ConvNetJS each produce a network that makes around 140 errors, TensorFlow.js produces a network that makes 270 errors. Even after four such training passes, the resulting TensorFlow.js network still made 180 errors.

So even though a single training pass of TensorFlow.js is a bit faster than UNN.js, the training quality is poor, and a network of a given quality is trained faster with UNN.js than with TensorFlow.js.

Do you think it could be due to the fact that JS works with 64-bit floats, while the GPU works with 32-bit floats? Or did I implement the network in a wrong way?
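
(One way to probe the precision hypothesis on the CPU: Math.fround rounds a JS double to the nearest 32-bit float, so a float32 weight update can be simulated. A minimal sketch:)

// compare a tiny weight update in float64 vs simulated float32
var w = 0.1;    // a weight
var g = 1e-9;   // a very small gradient step
console.log((w + g) - w);                  // ~1e-9: the update survives in float64
var w32 = Math.fround(w);
console.log(Math.fround(w32 + g) - w32);   // 0: the update is lost at float32 precision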

NoamGaash commented 4 years ago

I doubt it's a matter of floating-point precision, as lower-precision floats achieve similar results on TPUs. Maybe it's a different kernel initialisation method, or some other implementation detail.
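
(For what it's worth, TF.js layers accept an explicit kernelInitializer, so the initialisation could be matched by hand. A sketch for the first conv layer; the varianceScaling settings here are illustrative, not a verified match for ConvNetJS:)

// override the TF.js default initializer with a fan-in-scaled Gaussian
model.add(tf.layers.conv2d({
  inputShape: [24, 24, 1], kernelSize: 5, filters: 8,
  activation: 'relu', padding: 'same',
  kernelInitializer: tf.initializers.varianceScaling(
      {scale: 1.0, mode: 'fanIn', distribution: 'normal'})
}));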

photopea commented 4 years ago

I thought it might be the parameters of Adadelta, so I set them to the same values as are used in the UNN.js and ConvNetJS demos:

model.compile({optimizer:tf.train.adadelta(1, 0.95, 1e-6), loss: 'meanSquaredError'});
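// (parameter order: tf.train.adadelta(learningRate, rho, epsilon))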

But it does not train any better. If you want, try to rebuild this network in TF.js and train it down to 140 errors after one loop over the training data: https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html

Basically, after four training passes of TF.js, you get as good a network as after one pass of UNN.js or ConvNetJS. Four passes of TF.js take 200 seconds, one pass of ConvNetJS takes 246 seconds, and one pass of UNN.js takes 61 seconds.

chenjianan0823 commented 4 years ago

Is there a demo?

photopea commented 4 years ago

At https://github.com/photopea/UNN.js, I describe how to recreate a network for MNIST.