Result greater than 1 and wrong (digit recognition)

lucydjo commented 5 years ago

I'm starting out with Brain.js, and I'd like to do digit recognition. I have a list of 40K images of 28x28 pixels, which makes an input of 784 values, like this:

{ input : { pixel0: 0, pixel1: 0, pixel2: 255, pixel3: 255, pixel4: 0, ..................... } }

The value of each pixel ranges from 0 to 255, white to black. I also tried an array; it makes no difference to the result:

{ input : [0, 0, 255, 0, 0...........] }

What is wrong?

The output result is greater than 1 and is totally wrong. However, the error rate is 0.9% during training step.

{ N1: 0.008969820104539394,
  N0: 3.631080502941586e-8,
  N4: 1.5748662463010987e-7,
  N7: 0.0687410980463028,
  N3: 0.0002567432529758662,
  N5: 0.00001275186241400661,
  N8: 0.00001598958988324739,
  N9: 4.807806703865936e-7,
  N2: 0.035766009241342545,
  N6: 0.0015280867228284478 }

How do we replicate the issue?

Code: https://gist.github.com/lucaspojo/8244dc4d733d5a053cb92b4f3bc63773

Sample of training data: https://gist.github.com/lucaspojo/9a10e4af4f48e1bc1bdae668458c5755

The entire file is 70MB; I can share it if necessary.

How important is this (1-5)?

3

Expected behavior (i.e. solution)

EDIT: I think I totally forgot one fundamental thing. The input values must be between 0 and 1, but in my example I give it values between 0 and 255.

I corrected my code, and the error rate during training went from 0.9% to 0.002%.

[13:34:03] iterations: 1, training error: 0.06892831949742094
[13:34:04] iterations: 2, training error: 0.04353368171559561
[13:34:05] iterations: 3, training error: 0.03535015746493033
[13:34:07] iterations: 4, training error: 0.030196516320904053
[13:34:08] iterations: 5, training error: 0.026896815414783184
[13:34:10] iterations: 6, training error: 0.023825500711139782
[13:34:11] iterations: 7, training error: 0.021304108273752783
[13:34:12] iterations: 8, training error: 0.01930223130633866
[13:34:14] iterations: 9, training error: 0.017322111269441335
[13:34:15] iterations: 10, training error: 0.015641154504538003

But when tested, the result is always greater than 1 and is wrong.

Other Comments

More information about the dataset: https://www.kaggle.com/c/digit-recognizer/data

robertleeplummerjr commented 5 years ago

You need to normalize the inputs from the 0 to 255 range down to between 0 and 1. Or you may be able to use a different activation than the default, namely "sigmoid". But normalizing will make everything train more easily, period. This is just standard practice when it comes to neural networks.

Here would be an applicable normalizer:

function normalize(value) {
  const normalized = new Float32Array(value.length);
  for (let i = 0; i < value.length; i++) {
    normalized[i] = value[i] / 255;
  }
  return normalized;
}

And an appropriate de-normalizer:

function denormalize(value) {
  const denormalized = new Float32Array(value.length);
  for (let i = 0; i < value.length; i++) {
    denormalized[i] = value[i] * 255;
  }
  return denormalized;
}
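
As a minimal sketch of how the normalizer above could be used when building training data: the row shape `{ label, pixels }` and the helper name `toTrainingSample` are assumptions for illustration and are not taken from the linked gist. The `N0` to `N9` output keys follow the output objects shown in this thread.

```javascript
// Same normalizer as above, repeated so this sketch runs on its own.
function normalize(value) {
  const normalized = new Float32Array(value.length);
  for (let i = 0; i < value.length; i++) {
    normalized[i] = value[i] / 255;
  }
  return normalized;
}

// Hypothetical helper: turns one parsed row into an { input, output } pair.
// The row shape { label: number, pixels: number[] } is an assumption.
function toTrainingSample(row) {
  const output = {};
  output['N' + row.label] = 1; // e.g. label 3 becomes { N3: 1 }
  return { input: normalize(row.pixels), output };
}

// Tiny 4-pixel stand-in for a real 784-pixel row.
const sample = toTrainingSample({ label: 3, pixels: [0, 128, 255, 64] });
console.log(sample.input);  // every value is now between 0 and 1
console.log(sample.output); // { N3: 1 }
```

The same `normalize` call would need to be applied to whatever is passed in at test time, so that training and testing see inputs on the same scale.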

lucydjo commented 5 years ago

Thx for reply @robertleeplummerjr

In my EDIT I mentioned that I had already done this normalization and still got the same result.

Testing output :

{ N1: 1.6050071272033506e-9,
  N0: 0.9709343910217285,
  N4: 0.000021652305804309435,
  N7: 0.00009061139280674979,
  N3: 0.00007016883319010958,
  N5: 6.426664640457602e-7,
  N8: 0.00016966099792625755,
  N9: 0.000005240083282842534,
  N2: 0.016301261261105537,
  N6: 0.00018484082829672843 }

To be sure, I used your normalization function for both training and testing, and I still have the same problem.

I inspected the input values, and they are normalized between 0 and 1. I don't understand what's wrong.
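
One way to run that kind of input check, as a minimal sketch: the helper name `inputRange` and the five-pixel sample are made up for illustration and do not come from the linked code.

```javascript
// Hypothetical helper: reports the smallest and largest value in an input array.
function inputRange(values) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return { min, max };
}

// Made-up pixel values, before and after dividing by 255.
const raw = [0, 0, 255, 255, 0];
const scaled = raw.map((v) => v / 255);

console.log(inputRange(raw));    // { min: 0, max: 255 }
console.log(inputRange(scaled)); // { min: 0, max: 1 }
```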

robertleeplummerjr commented 5 years ago

The two items in question, it seems, are:

{ N1: 1.6050071272033506e-9,
N5: 6.426664640457602e-7,

These contain an e, which is scientific notation: e-7 means "times 10 to the power of -7", so these are very tiny numbers, not numbers greater than 1. If you run them through JavaScript you'll see this, for example:

6.426664640457602e-7 > 0.001 -> false
1.6050071272033506e-9 > 0.001 -> false

You can use .toFixed(n) to get a better look at them, if you aren't used to the scientific notation.

(6.426664640457602e-7).toFixed(10) -> "0.0000006427"
(1.6050071272033506e-9).toFixed(10) -> "0.0000000016"
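
To read the whole result at once, a small sketch like the following picks the label with the highest score. The output object is copied from the test output earlier in this thread; the helper name `mostLikely` is an assumption for illustration, not part of brain.js.

```javascript
// Output object copied from the test result posted above.
const output = {
  N1: 1.6050071272033506e-9,
  N0: 0.9709343910217285,
  N4: 0.000021652305804309435,
  N7: 0.00009061139280674979,
  N3: 0.00007016883319010958,
  N5: 6.426664640457602e-7,
  N8: 0.00016966099792625755,
  N9: 0.000005240083282842534,
  N2: 0.016301261261105537,
  N6: 0.00018484082829672843,
};

// Hypothetical helper: returns the key with the largest value.
function mostLikely(result) {
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, score] of Object.entries(result)) {
    if (score > bestScore) {
      bestLabel = label;
      bestScore = score;
    }
  }
  return { label: bestLabel, score: bestScore };
}

const best = mostLikely(output);
console.log(best.label, best.score.toFixed(4)); // N0 0.9709

// Every value, including the ones printed with an "e", lies between 0 and 1.
console.log(Object.values(output).every((v) => v >= 0 && v <= 1)); // true
```

Read this way, the network is scoring the sample as a 0 at about 0.97, and every other digit is close to zero.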

lucydjo commented 5 years ago

Thank you for this extremely important information! Everything works fine then!

Thx !

robertleeplummerjr commented 5 years ago

Glad to be of service.
