microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

Eval Results #643

Closed K1T00 closed 8 years ago

K1T00 commented 8 years ago

After completing Tutorial 2, I tried to evaluate the onehidden model (CPU) on MNIST images which I had saved as TIFF files.

So I used the C# client, loaded the images (as bitmaps), and used the method provided here: https://github.com/Microsoft/CNTK/wiki/CNTK-Evaluate-Image-Transforms to transform them to CHW layout.

The results for the two images, one showing a "1" and one showing a "3", are the following:

For 1:

[screenshot: eval1]

For 3:

[screenshot: eval3]

The results don't make any sense to me. Could someone please explain them? Or did I make a mistake somewhere?

mfuntowicz commented 8 years ago

Hi Flav1u,

The node "ol" in the One_Hidden configuration is as follow :

ol = DNNLayer(hiddenDim, labelDim, h1, 1)

If you look at the DNNLayer macro in the file Macros.ndl:
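From memory, the macro looks roughly like this (check your local copy of Macros.ndl; exact names may differ):

    DNNLayer(inDim, outDim, x, parmScale) = [
        # weight matrix and bias vector, both learned during training
        W = LearnableParameter(outDim, inDim, init="uniform", initValueScale=parmScale)
        b = LearnableParameter(outDim, 1,     init="uniform", initValueScale=parmScale)
        t = Times(W, x)   # W * x
        z = Plus(t, b)    # W * x + b
    ]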

This layer has two dimension parameters: inDim, the input dimension, and outDim, the output dimension.

Then ol.z = W * x + b (the inputs are weighted by the learned weight matrix W, and the learned bias b is added). No non-linearity is applied on this layer.

As you can see, the outDim parameter in the macro above indicates the number of outputs of your layer. For ol, outDim is labelDim, which means that the number of output neurons of your last layer is the number of classes you're trying to predict. For MNIST, you have the digits 0 to 9 to predict, so 10 classes.

If you take a look at your output, you will see that there are 10 lines (one line per neuron of the last layer). So the ith line is the output of the ith neuron and "a kind of probability" for the ith class you're trying to predict. If you want the prediction made by your NN as a single scalar in [0-9], you just have to take the output (~ line / neuron) with the highest value.

In addition, if you want real probabilities on ol.z, you can wrap ol with a Softmax function as follows:
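For example ("Proba" is just a node name, pick any):

    Proba = Softmax(ol)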

and add the node Proba to the output nodes:
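For instance, in the special-nodes section of the network description (a sketch; the exact section layout may differ in your config):

    OutputNodes = (Proba)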

By doing so, and printing the values of Proba, you will have values in the range [0, 1] representing the probability of the input being the ith class.

Hope it's clear and it helps :).

K1T00 commented 8 years ago

Thanks a lot for the answer.

Well, in this case my model doesn't seem to be working properly.

In the first case the model predicts a "0" but it is in fact a "1" in the image. In the second case it predicts an "8" instead of a "3".

Since I used images from the MNIST data set and not newly generated ones, it looks like I'm doing something wrong.

mfuntowicz commented 8 years ago

You're not doing anything wrong (I think :)). It's the limit of how well a single fully connected hidden layer generalizes for this model.

The next step would be to try the 01_Conv model, which uses convolutions to give better results :). Give it a try; you should see an improvement in the final prediction.

Otherwise, you can try to tune the topology by adding another fully connected layer and rerunning your training.

K1T00 commented 8 years ago

I tried the convolutional model and the results indeed improved: the model now predicted the 3 in the image, but it still couldn't predict the 1...

I have two questions regarding the necessary image transformation.

  1. I used CHW, but trained the model on CPU only. Could that worsen my results? (Is CHW GPU-only?)
  2. As far as I understand, the (BrainScript) image reader normalizes the images; for example, on 8-bit images it does a 1/256 normalization. Is that correct? Do I have to normalize the images myself before evaluating them with the model? (See the snippet after this list for the scaling I mean.)
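For reference, this is the 1/256 scaling I mean (a sketch from memory of the One_Hidden config; exact names may differ):

    # scale 8-bit pixel values from [0, 255] into [0, 1)
    featScale  = Const(0.00390625)   # = 1/256
    featScaled = Scale(featScale, features)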

For the "1" I got:

[screenshot: evalconv1]

For the "3" I got:

[screenshot: evalconv3]

K1T00 commented 8 years ago

Ok, I now know what my dumb mistake was: I had to apply a NOT operation to (i.e. invert) my images. It looks like the MNIST image set loaded into the model has a black background and white digits, while mine had a white background and black digits.

Sorry for the newb mistake :-)

Thank you anyway, mfuntowicz; thanks to you I've gained a better understanding of the (BrainScript) macros.