Neuroglycerin / neukrill-net-work

NDSB competition repository for scripting, note taking and writing submissions.
MIT License

Setting learning rate initialisation and schedule using a small subset of training data #66

Closed matt-graham closed 9 years ago

matt-graham commented 9 years ago

This book chapter (http://link.springer.com/chapter/10.1007/978-3-642-35289-8_25/fulltext.html#Fig6) suggests the learning rate schedule and initialisation can be validly set using a smaller (representative) subset of the training data. We should therefore implement some way of specifying in the run settings that only a subset of the data should be used (e.g. the ratio of examples to use relative to the total available), to allow quicker experiments to identify a good learning rate schedule and initialisation.
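A minimal sketch of what "specify a ratio of examples" could mean in practice (the function name and seed handling here are hypothetical, not part of the existing codebase): pick a reproducible random subset of example indices given the fraction to keep.

```python
import random

def subsample_indices(n_total, ratio, seed=42):
    """Pick a reproducible random subset of example indices.

    n_total -- number of examples available
    ratio   -- fraction of examples to keep (0 < ratio <= 1)
    seed    -- fixed seed so repeated runs see the same subset
    """
    n_keep = max(1, int(n_total * ratio))
    return random.Random(seed).sample(range(n_total), n_keep)
```

For example, `subsample_indices(1000, 0.2)` yields 200 distinct indices in `[0, 1000)`, and the fixed seed means every experiment compares learning rate settings on the same subset.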

gngdb commented 9 years ago

This could be done by modifying `train_test_split` in utils to accept another string key that splits out a portion (20%?) of the training set.


matt-graham commented 9 years ago

Setting `training_set_mode` to `test` in the training data set definition works for doing this (even if it is a little bit hacky).