zackchase / mxnet-the-straight-dope

An interactive book on deep learning. Much easy, so MXNet. Wow. [Straight Dope is growing up] Much of this content has been incorporated into the new Dive into Deep Learning book, available at https://d2l.ai/.
Apache License 2.0
2.56k stars · 724 forks

Deep Neural Network / Dropout regularization with gluon #527

Open theoneandonlywoj opened 5 years ago

theoneandonlywoj commented 5 years ago

In the notebook, I noticed that the accuracy is calculated as follows:

def evaluate_accuracy(data_iterator, net):
    acc = mx.metric.Accuracy()
    for i, (data, label) in enumerate(data_iterator):
        data = data.as_in_context(ctx).reshape((-1, 784))
        label = label.as_in_context(ctx)
        output = net(data)
        predictions = nd.argmax(output, axis=1)
        acc.update(preds=predictions, labels=label)
    return acc.get()[1]

I am a little confused, as I think that during training the test (or validation) accuracy is evaluated as if the dropout probability were still 0.5. I can understand the simplification for training purposes, but shouldn't the training accuracy be evaluated with dropout active, and the dropout probability be 0 for the validation and test accuracy? Would a solution be an additional parameter `include_dropout`, as below:

def evaluate_accuracy(data_iterator, net, include_dropout=True):
    acc = mx.metric.Accuracy()
    with autograd.record(train_mode=include_dropout):
        for i, (data, label) in enumerate(data_iterator):
            data = data.as_in_context(ctx).reshape((-1, 784))
            label = label.as_in_context(ctx)
            output = net(data)
            predictions = nd.argmax(output, axis=1)
            acc.update(preds=predictions, labels=label)
    return acc.get()[1]
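For reference, the behavior in question can be sketched without MXNet at all. The snippet below is a hypothetical standalone NumPy implementation of "inverted dropout" (the scheme Gluon's `Dropout` layer follows, not its actual code): during training, units are zeroed with probability `p` and the survivors are rescaled by `1/(1-p)`; at evaluation time the layer is the identity, which is why no extra rescaling or training-mode flag is needed when computing test accuracy.

```python
import numpy as np

def dropout_layer(x, p, train=True, rng=None):
    """Inverted dropout sketch (hypothetical helper, not Gluon's code).

    Training: zero each unit with probability p, rescale survivors
    by 1/(1-p) so the expected activation is unchanged.
    Evaluation: pass the input through untouched.
    """
    if not train or p == 0.0:
        return x  # evaluation mode: identity, no masking, no rescaling
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p  # keep a unit with probability 1-p
    return x * mask / (1.0 - p)

x = np.ones((4, 3))
train_out = dropout_layer(x, p=0.5, train=True, rng=np.random.default_rng(0))
eval_out = dropout_layer(x, p=0.5, train=False)

# Evaluation leaves activations unchanged
assert np.array_equal(eval_out, x)
# Training output contains only zeros and rescaled (2.0) activations
assert set(np.unique(train_out)) <= {0.0, 2.0}
```

Because the rescaling happens at training time, the expected value of each activation matches between the two modes, so evaluating with dropout disabled is the intended behavior rather than a simplification.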

Regards

Wojciech