dnouri / nolearn

Combines the ease of use of scikit-learn with the power of Theano/Lasagne
http://danielnouri.org/notes/2014/12/17/using-convolutional-neural-nets-to-detect-facial-keypoints-tutorial/
MIT License
949 stars 260 forks

Question on batch_iterator_train / test and custom_score #183

Closed · BrianMiner closed this 8 years ago

BrianMiner commented 8 years ago

Please let me know if this is better asked in the Lasagne user group. It is a question (or a few questions), not an issue.

I am trying to make sense of what is happening when using batch_iterator_train /test and a custom score function.

My data X shape is (39785, 1727) and y is (39785,). Y is numeric (actually ordinal).

The eval function is simply this; I put a print in to see the shape of the true y:

import ml_metrics

def kappa(y_true, y_predict):
    print(y_true.shape)  # inspect the shape of the true y
    return ml_metrics.quadratic_weighted_kappa(
        y_true.ravel(), y_predict.astype(int).ravel())

Here is the net setup:

layers0 = [('input', InputLayer),
           ('dense0', DenseLayer),
           ('dropout0', DropoutLayer),
           ('dense1', DenseLayer),
           ('output', DenseLayer)]

num_features = X_train.shape[1]

net0 = NeuralNet(layers=layers0,
                 input_shape=(None, num_features),

                 dense0_num_units=500,
                 dropout0_p=0.5,
                 dense1_num_units=500,

                 output_num_units=1,
                 output_nonlinearity=None,

                 update=nesterov_momentum,
                 update_learning_rate=0.0001,
                 update_momentum=0.9,
                 regression=True,

                 verbose=1,
                 max_epochs=50,

                 batch_iterator_train=BatchIterator(batch_size=5000),
                 batch_iterator_test=BatchIterator(batch_size=5000),

                 train_split=TrainSplit(eval_size=0.5),
                 custom_score=('kappa', kappa))

Questions:

1) What is the effect of having a separate batch_iterator_train and batch_iterator_test? I imagined that 50,000 records are sampled from the training set and that, given the eval_size, 50% of them would be used for training and 50% for validation in the progress printout. So I am not sure what effect batch_iterator_test has.

2) For the custom score, are y_true and y_predict from the training or validation set? Can you access and print both?

3) For the custom score, I printed out the shape of y_true; the output for one epoch is below. Why is the function called 4 times instead of just once? Is the resulting 'kappa' printed below the average of the 4 mini-batches?

(5000, 1)
(5000, 1)
(5000, 1)
(4893, 1)
  epoch    train loss    valid loss    train/val     kappa  dur
-------  ------------  ------------  -----------  --------  -----
      1      39.74356      28.53488      1.39281  -0.01199  5.53s
dnouri commented 8 years ago

1) What is the effect of having a separate batch_iterator_train and batch_iterator_test?

I usually do data augmentation on the training examples only, and thus only in batch_iterator_train.
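This train/test asymmetry can be sketched with a minimal stand-in for nolearn's BatchIterator (the stand-in class, the data, and the Gaussian-noise augmentation are illustrative assumptions, not code from this thread; in nolearn itself, subclassing BatchIterator and overriding its transform hook is the usual pattern):

```python
import numpy as np

np.random.seed(0)  # deterministic noise for the demo

class SimpleBatchIterator:
    """Minimal stand-in for a batch iterator: yields (Xb, yb)
    mini-batches and exposes a transform() hook."""
    def __init__(self, batch_size):
        self.batch_size = batch_size

    def __call__(self, X, y):
        for i in range(0, len(X), self.batch_size):
            yield self.transform(X[i:i + self.batch_size],
                                 y[i:i + self.batch_size])

    def transform(self, Xb, yb):
        return Xb, yb  # identity: test-time batches pass through untouched

class NoisyBatchIterator(SimpleBatchIterator):
    """Train-time iterator: augments each batch with Gaussian noise."""
    def transform(self, Xb, yb):
        Xb = Xb + np.random.normal(0.0, 0.01, Xb.shape)
        return Xb, yb

X, y = np.zeros((10, 3)), np.zeros(10)
Xb_train, _ = next(NoisyBatchIterator(batch_size=4)(X, y))
Xb_test, _ = next(SimpleBatchIterator(batch_size=4)(X, y))
print(np.allclose(Xb_test, 0.0))       # True: no augmentation at test time
print(not np.allclose(Xb_train, 0.0))  # True: noise was added to the train batch
```

The validation/scoring pass uses the plain iterator, so scores reflect the unmodified data, while every training batch is perturbed on the fly.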

2) For the custom score, are y_true and y_predict from the training or validation set? Can you access and print both?

You can only access the validation examples; that's what your function is passed.

3) For the custom score, I printed out the shape of y_true; the output for one epoch is below. Why is the function called 4 times instead of just once? Is the resulting 'kappa' printed below the average of the 4 mini-batches?

Yes, it's the average. Your function is called once per batch, which is why you get matrices of 5,000 rows passed into it.
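The four printed shapes line up with this arithmetic: with eval_size=0.5 of 39,785 rows, the validation set holds about 19,893 examples, which a batch size of 5,000 splits into batches of 5000, 5000, 5000, and 4893. A plain-Python sketch (the exact rounding of the split and the per-batch scores are assumptions; the averaging is the behavior described above):

```python
# Reproduce the four printed shapes and the averaging.
n_valid = 19893     # eval_size=0.5 of 39785 rows; this rounding is inferred
batch_size = 5000   # from the printed shapes: 5000+5000+5000+4893 = 19893

sizes = [min(batch_size, n_valid - i) for i in range(0, n_valid, batch_size)]
print(sizes)  # [5000, 5000, 5000, 4893] -- one call to kappa() per batch

# The reported 'kappa' column is the mean of the per-batch scores.
per_batch = [0.01, -0.02, -0.03, -0.008]       # hypothetical per-batch kappas
print(round(sum(per_batch) / len(per_batch), 3))  # -0.012
```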

BrianMiner commented 8 years ago

Thank you! Regarding question 1, I think I am still a bit confused.

If I change the code to this (note that I comment out the test iterator):

batch_iterator_train=BatchIterator(batch_size=10000),
# batch_iterator_test=BatchIterator(batch_size=5000),
train_split=TrainSplit(eval_size=0.5),

It appears that the validation size becomes the default of 128. Is TrainSplit(eval_size=0.5) just ignored, with no effect?

If I do this:

batch_iterator_train=BatchIterator(batch_size=10000),
batch_iterator_test=BatchIterator(batch_size=9000),

does this mean that, of the 10,000 records per mini-batch, 9,000 are used for validation and only 1,000 for training?

dnouri commented 8 years ago

It appears that the validation size becomes the default of 128. Is TrainSplit(eval_size=0.5) just ignored, with no effect?

Ah, I think you're confusing the batch size with the train/validation split. The batch size is the number of examples used in each iteration of gradient descent; it has nothing to do with the overall size of the train or validation sets.
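The distinction can be made concrete with a little arithmetic (the rounding of the 50/50 split is an assumption; the observed batch shapes suggest the validation half gets the extra example):

```python
import math

n_total, eval_size = 39785, 0.5
n_train = int(n_total * (1 - eval_size))  # 19892 (rounding is an assumption)
n_valid = n_total - n_train               # 19893, matching the printed shapes

train_batch, test_batch = 10000, 5000
# The split fixes *which* rows are train vs. validation, once, up front.
# The batch sizes only fix how many rows each step processes.
steps_per_epoch = math.ceil(n_train / train_batch)  # gradient updates per epoch
eval_batches = math.ceil(n_valid / test_batch)      # scoring passes per epoch
print(n_train, n_valid)               # 19892 19893
print(steps_per_epoch, eval_batches)  # 2 4
```

So with batch_size=9000 on the test iterator, the validation set is still the same 19,893 rows; it is merely scored in ceil(19893/9000) = 3 chunks instead of 4.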