PLEASE REVIEW CAREFULLY; mistakes here may lead to subtle errors in error estimation.
Changes:
Main change (fixes #5 in a more immediately useful way): 5% of the training data is now held out and used to estimate validation error a few times per epoch (the frequency is configurable). The 5% fraction is also configurable, but it comes to ~10k samples, which IMHO is more than enough to tell whether we're overfitting or going wrong in some other way.
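For reviewers, the split amounts to something like the sketch below (function and argument names here are hypothetical; the real knobs live in the training config):

```python
import numpy as np

def split_held_out(X, y, val_fraction=0.05, seed=42):
    """Shuffle once, then hold out val_fraction of the data for validation.

    val_fraction and the per-epoch evaluation frequency are the
    configurable knobs mentioned above.
    """
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]
```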
Improved fabfile.py by removing hardcoded paths. Default path settings can still be manually overridden in most commands if necessary.
Fabric task for downloading the data: `fab get_data`
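Roughly what the de-hardcoded tasks look like (a sketch: `get_data` is the real task, but the default path constant and the download URL below are placeholders):

```python
# fabfile.py (sketch)
from fabric.api import local, task

DATA_DIR = "data"  # repo-relative default instead of a hardcoded absolute path

@task
def get_data(data_dir=DATA_DIR):
    """Download the dataset into data_dir; the default can be overridden."""
    local("mkdir -p %s" % data_dir)
    # placeholder URL, not the actual dataset location
    local("wget -P %s %s" % (data_dir, "http://example.org/dataset.tar.gz"))
```

Overriding the default per invocation works the usual Fabric way, e.g. `fab get_data:data_dir=/scratch/data`.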
The LSTM module is now much more parameterizable: the number and size of LSTM layers can be set on the command line, and there is an (as yet untested) option for bidirectional LSTMs.
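In case the flag shape isn't obvious from the diff, the parameterization boils down to something like this (PyTorch is used here purely for illustration; the flag names and the input feature dimension are assumptions, not necessarily what the module uses):

```python
import argparse
import torch.nn as nn

parser = argparse.ArgumentParser()
parser.add_argument("--lstm-layers", type=int, default=2,
                    help="number of stacked LSTM layers")
parser.add_argument("--lstm-size", type=int, default=128,
                    help="hidden units per LSTM layer")
parser.add_argument("--bidirectional", action="store_true",
                    help="use a bidirectional LSTM (untested!)")
args = parser.parse_args()

# input_size=40 is a placeholder feature dimension
lstm = nn.LSTM(input_size=40,
               hidden_size=args.lstm_size,
               num_layers=args.lstm_layers,
               bidirectional=args.bidirectional,
               batch_first=True)
```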
Fixed the random seed in trainMLP.py (partial fix for #8; the evaluation code still needs the same treatment).
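The seed fix is just the usual pattern (a sketch; the actual seed value and which RNGs trainMLP.py touches may differ), and per #8 the same call still needs to land in the evaluation code:

```python
import random
import numpy as np

SEED = 1234  # hypothetical value; the point is that it's now fixed, not per-run

def set_seed(seed=SEED):
    """Seed every RNG the training code uses, for reproducible runs."""
    random.seed(seed)
    np.random.seed(seed)
```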