johndun / mnist-torch7


Saving pre_params in train.lua #1

Open harkmug opened 9 years ago

harkmug commented 9 years ago

In line 27 of train.lua, should torch.save('models/preprocessing_params.t7', params) save pre_params instead of params? I was trying to run predict on the trained model for a new dataset (the Kaggle MNIST Digit Recognition challenge). Are you thinking of importing the predict.lua code from here (https://github.com/nagadomi/kaggle-cifar10-torch7/blob/cuda-convnet2/predict.lua)? Thanks for the code!

johndun commented 9 years ago

Definitely should be pre_params. I had spent most of my time working with the code in validate.lua.

I wasn't aware of a Kaggle MNIST dataset. I'd imagine you'd want to do a bunch of jittering to place on the leaderboard.

I'll have to check it out. Thanks.


harkmug commented 9 years ago

Thanks, John. I got a predict function working and ran the Kaggle MNIST test set through the "setup_mlp" model (changing the input image size to 28 x 28 before training). It made it into the top 10 with a score of 0.99743! Next up, I'll try convnet1.

(This is raghu, btw)

johndun commented 9 years ago

Did you train using the data from the torch7 demo? 0.99743 is too high for a simple MLP model. I have to do some crazy stuff to get to 0.9965 on the standard train/test split. See here. I think Kaggle must have shuffled the test set around.

harkmug commented 9 years ago

Yes, it is cheating a bit. Kaggle split the MNIST dataset into 42,000 training and 28,000 test images. I did not use their training data, since you had the full MNIST data already loaded as part of this demo; I just used the trained model to predict on the 28K test set. Presumably, many of the same images were already in the demo training data used by the model. Hence the high scores for all participants (most of the 99.5+ folks must be using the full MNIST dataset, not just the 42K provided). I was just trying this out as a way to get familiar with the demo code.
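One rough way to check the overlap suspected above is to hash every training image and count how many test images produce the same hash. The sketch below is in Python rather than the repo's Lua, uses made-up 2 x 2 toy images instead of MNIST, and the helper names image_digest and count_overlap are hypothetical. It only catches byte-identical images, so images that were padded, resized, or normalized differently between the two sources would need to be brought to a common form first.

```python
import hashlib


def image_digest(pixels):
    """Return a hex digest for one image given as a flat sequence of 0-255 ints."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def count_overlap(train_images, test_images):
    """Count test images whose exact pixel values also appear in the training set."""
    seen = {image_digest(img) for img in train_images}
    return sum(1 for img in test_images if image_digest(img) in seen)


# Toy 2 x 2 "images" (hypothetical data, not MNIST).
train = [[0, 255, 0, 255], [10, 20, 30, 40], [0, 0, 0, 0]]
test = [[10, 20, 30, 40], [1, 2, 3, 4]]

overlap = count_overlap(train, test)
print(f"{overlap} of {len(test)} test images also appear in the training set")
```

A large overlap count on the real data would support the guess that the leaderboard scores are inflated by training on images that are also in the test set.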

johndun commented 9 years ago

I created a new repository with some better code for fitting and testing against the Kaggle version. The same one-hidden-layer model gets to 0.97957 on the public leaderboard test set (127th position when I submitted) when trained only on the 42K samples in the Kaggle training set.