Open JasperSnoek opened 10 years ago
Thanks for putting this codebase together; I think it can be very useful for MATLAB users who want to play around with deep nets. I noticed that multiclass predictions are being passed through multiple logistic functions. See e.g. https://github.com/rasmusbergpalm/DeepLearnToolbox/blob/master/CNN/cnnff.m#L37

This is technically incorrect unless you actually want to be able to predict multiple classes at the same time. You want to output a multinomial (one-of-N) distribution rather than N binomial distributions, i.e. a single multi-class prediction instead of N independent binary predictions. What you want to backpropagate through is the softmax function (http://en.wikipedia.org/wiki/Softmax_function), which generalizes the logistic function to multiple classes and normalizes the outputs into a proper probability distribution. Backpropagating through this will result in a much better model.
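To make the difference concrete, here is a minimal sketch (illustrative scores and variable names, not code from the toolbox):

```matlab
% Final-layer scores for one input with K = 3 classes (made-up numbers).
z = [2.0; 1.0; -1.0];

% N independent logistic (sigmoid) outputs: each lies in (0,1), but they
% need not sum to 1, so together they are not a distribution over classes.
p_sigmoid = 1 ./ (1 + exp(-z));

% Softmax: subtract the max score for numerical stability, exponentiate,
% and normalize, giving a proper one-of-K distribution (sums to 1).
e = exp(z - max(z));
p_softmax = e ./ sum(e);

% With a cross-entropy loss and a one-hot target y, the gradient of the
% loss with respect to the scores is simply (p_softmax - y), which is the
% error signal that gets backpropagated through the softmax layer.
y = [1; 0; 0];              % one-hot target: true class is class 1
dz = p_softmax - y;
```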
Best,
Jasper
Yes, I also agree on this. Currently it's doing N binary class predictions. I think this is an important thing to fix.
Binary class prediction has its merits as well: if the final dataset that you are going to apply your predictions to contains inputs that don't belong to any class (e.g. the classes are the letters A, B, and C, and some inputs are none of them), then binary logistic units let you set a threshold and decline to classify those inputs at all. With softmax, all of the probability mass must be allocated among the classes, even when every class score is very low.

But I find @JasperSnoek's comment interesting, that backpropagating through the softmax results in a better model. Why would that be? Would it make sense to train using the softmax but use binary logistic units at prediction time?
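One way to keep that rejection option while still training with softmax is to threshold the winning softmax probability at prediction time; a minimal sketch (the 0.9 threshold and variable names are illustrative, not part of the toolbox):

```matlab
% Scores z for one input from a net trained with a softmax output layer.
z = [0.3; 0.2; 0.1];        % made-up scores; every class looks weak

e = exp(z - max(z));
p = e ./ sum(e);            % softmax probabilities

% Reject the input if even the most likely class is not confident enough.
% The 0.9 threshold is illustrative and would be tuned on held-out data.
[pmax, k] = max(p);
if pmax < 0.9
    label = 0;              % 0 = "none of the known classes"
else
    label = k;
end
```

Note that because the softmax normalizes away the overall scale of the scores, an input whose scores are all low can still come out looking confident after normalization; thresholding the winning probability only catches ambiguity between the known classes, which is exactly the limitation raised above.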