dnouri / cuda-convnet

My fork of Alex Krizhevsky's cuda-convnet from 2013, where I added dropout, among other features.
http://code.google.com/p/cuda-convnet/

Why should we double the number of outputs when using dropout? #5

Closed invisibleroads closed 10 years ago

invisibleroads commented 10 years ago

Hi Daniel,

Thank you for modifying Alex's code to enable Hinton's dropout.

Could you please explain in the README why you suggest doubling the number of outputs in the last layer when using dropout?

RHH

From the README:

> In practice, you'll probably also want to double the number of outputs in that layer.

Does that mean that if we are making a simple binary classifier, the number of outputs should be four when using dropout? How would we interpret four outputs from a binary classifier?

dnouri commented 10 years ago

It doesn't say to double the number of outputs in the last layer. It says to double the number in that layer -- the layer where you add dropout.
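For concreteness, here is a minimal NumPy sketch of that distinction. The layer sizes are hypothetical and not taken from the fork's config files: dropout is applied to a hidden layer whose width has been doubled from 1024 to 2048, while the output layer of the binary classifier keeps its two outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration: a hidden layer that would have had
# 1024 units without dropout is grown to 2048 because dropout with
# p = 0.5 is applied to it.  The output layer stays at 2 units.
n_in, n_hidden, n_out = 64, 2048, 2
p_drop = 0.5

W1 = rng.normal(0.0, 0.01, (n_in, n_hidden))
W2 = rng.normal(0.0, 0.01, (n_hidden, n_out))

x = rng.normal(size=(1, n_in))
h = np.maximum(x @ W1, 0.0)          # hidden activations (ReLU)

# Training time: drop each hidden unit independently with probability p_drop.
mask = rng.random(h.shape) >= p_drop
h_train = h * mask

# Test time (Hinton's scheme): keep every unit, scale by (1 - p_drop).
h_test = h * (1.0 - p_drop)

print((h_train @ W2).shape)  # (1, 2) -- still two outputs for two classes
print((h_test @ W2).shape)   # (1, 2)
```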

Regularizing a net with dropout usually lets you make it larger than an equivalent network that doesn't use dropout. With a dropout rate of 0.5, only about half of the layer's units are active during any given training pass, so doubling the layer's width keeps its effective capacity roughly where it was.
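The same intuition in a couple of lines of arithmetic (the width of 1024 is chosen purely for illustration): with dropout probability p, the expected number of active units in a layer of width n is (1 - p) * n, so at p = 0.5 doubling the width restores the layer's original expected capacity.

```python
p = 0.5            # dropout probability
n_original = 1024  # hypothetical layer width without dropout
n_doubled = 2 * n_original

print((1 - p) * n_original)  # 512.0 units active on average: capacity halved
print((1 - p) * n_doubled)   # 1024.0: doubling the width restores it
```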