hughperkins / DeepCL

OpenCL library to train deep convolutional neural networks
Mozilla Public License 2.0

Output from network #34

Closed: m-smith closed this issue 8 years ago

m-smith commented 8 years ago

In a multi-class problem like MNIST, how can I get the output of a softmax layer for each class (plane) using the python wrapper? getOutput returns a list of what seem to be probabilities, but what these probabilities represent is unclear to me!

hughperkins commented 8 years ago

Hmmm... it's been a long time since I looked at this... As for the literal answer to your question, you mean how to calculate the label given the probabilities? That will just be whichever plane gives the maximum probability.

In the underlying C++ library, though, the SoftMaxLayer has a method void getLabels(int *labels) (https://github.com/hughperkins/DeepCL/blob/master/src/loss/SoftMaxLayer.cpp#L297), which will do this for you. But I will need to ponder a bit how to expose this in the python wrapper, so you might want to just hack and paste that method into python for now, i.e. this code:

    // For each example n in the batch, find the plane (class) with the
    // highest softmax probability; that plane's index is the predicted label.
    for(int n = 0; n < batchSize; n++) {
        float *outputStack = output + n * numPlanes; // probabilities for example n
        float highestProb = outputStack[0];
        int bestPlane = 0;
        for(int plane = 1; plane < numPlanes; plane++) {
            if(outputStack[plane] > highestProb) {
                bestPlane = plane;
                highestProb = outputStack[plane];
            }
        }
        labels[n] = bestPlane; // argmax over planes
    }
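
For reference, here's a minimal Python sketch of the same argmax, assuming getOutput() returns a flat array of batchSize * numPlanes probabilities in row-major order (the function name here is illustrative, not part of the DeepCL API):

    import numpy as np

    def labels_from_probs(output, batch_size, num_planes):
        # Reshape the flat probabilities into one row per example, then take
        # the argmax over planes to get each example's predicted label.
        probs = np.asarray(output, dtype=np.float32).reshape(batch_size, num_planes)
        return probs.argmax(axis=1)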
m-smith commented 8 years ago

Right, I figured I could do something like the above. The issue is that the python wrapper's getOutput method doesn't seem to provide access to all the planes... maybe just one, or the max of each plane for each input.

hughperkins commented 8 years ago

Hmmm, glancing through the code (since, as I say, it's been a while :-P ), it looks like getOutput should return an array that is getOutputNumElements() long, and getOutputNumElements() is an abstract method of Layer, overridden by LossLayer to simply return the number of output elements of the previous layer.

So, in theory, if I'm reading correctly, the size of getOutput should be the same as the size of the output of the previous layer.
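
In other words, something like the following sketch should hold, assuming the python wrapper exposes the same method names as the C++ Layer class (getOutputNumElements in particular is an assumption here, not a confirmed python API):

    last = net.getLastLayer()
    numElements = last.getOutputNumElements()  # assumed: batchSize * numPlanes
    probs = last.getOutput()                   # flat array of probabilities
    assert len(probs) == numElements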

Can you post the code that you're using to test, and I can have a play?

m-smith commented 8 years ago

I dunno what I was looking at/doing... I revisited my test cases and this looks like it's working perfectly... sorry for the trouble!

hughperkins commented 8 years ago

Ok, cool :-) By the way, do you most need the predicted output label of each example, or do you most need the underlying probabilities?

m-smith commented 8 years ago

Just the underlying probabilities are good enough for me, but I can definitely see the use of a wrapper for the prediction... it would make things that much easier...

hughperkins commented 8 years ago

Hi Matthew, ok, do you mean that probabilities are enough because you already have the code to convert them to labels? Or that probabilities are what you're actually trying to obtain, and you don't need the labels?

m-smith commented 8 years ago

I need both, so I do need the labels, but I am already working with the probabilities so I don't mind computing the labels myself.

hughperkins commented 8 years ago

Added in d70b598. You can see test_lowlevel.py lines 66-68 for an example:

            lastLayer = net.getLastLayer()
            predictions = lastLayer.getLabels()
            print('predictions', predictions)
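
As a quick sanity check, the returned labels should agree with an argmax over the probabilities from getOutput(); a sketch, assuming the output reshapes to one row of plane probabilities per example:

    import numpy as np

    probs = np.asarray(lastLayer.getOutput()).reshape(len(predictions), -1)
    # Each predicted label should be the index of the most probable plane.
    assert (probs.argmax(axis=1) == np.asarray(predictions)).all()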