Add test.py or infer.py

amjltc295 commented 5 years ago

It would be great if there is a test.py template that could be used to infer testing data with pretrained model in saved/. Also, I notice that valid_epoch, train_epoch and the inference code seem to be similar. Maybe they could be refactored to three functions that use a common function with some more parameters (e.g, whether to do pack propagation, calculate loss ...).

What do you guys think?

SunQpark commented 5 years ago

That's a good idea. I have added test.py with MNIST test example just now. Please check commit https://github.com/victoresque/pytorch-template/commit/df5e94940a155d746a74f08b67a700904bc2edef and tell me if anything feels not good!

amjltc295 commented 5 years ago

Such a quick response! The test.py file seems good to me.

My original idea was to merge the main part of _train_epoch() and _valid_epoch() in trainer.py into _do_epoch() since they are similar.

For example, if I want to change the input structure, say a (x1, x2) as Siamese input, data, target = data.to(self.device), target.to(self.device) would become (x1, x2), target = (x1.to(self.device), x1.to(self.device)), target.to(self.device).

To do this, _train_epoch(), _valid_epoch() and main() in test.py all have to be modified. If there is a _do_epoch(), _train_epoch(), _valid_epoch() and _test() could call it with their different conditions.

However, _do_epoch() would be more complicated since it would need some more parameters and if-else conditions, so I'm not quite sure if the _do_epoch() is a good idea. I guess merging the data, target = data.to(self.device), target.to(self.device) in both _train_epoch() and _valid_epoch() to a _data_to_device() function would be a lot easier.

Lastly, personally I would prefer sending a pull request in stead of directly push to the master so that others could review the code and notice the changes.

Thanks for your contribution :)

SunQpark commented 5 years ago

Oh, I'm sorry for making direct pushes carelessly, just now have I noticed that my commits look messy. The test.py code is some modified version of code in one of my previous project, about kaggle competition. In that competition, the annotation of test data was not available, and purpose of the test.py code was to produce model predictions and writing them into a submission file. I think having separate test.py file is more suitable, considering that people may want test code for various reasons. Of course we can make the testing as one of methods of trainer, but I think the trainer is too heavy for that. trainer has so much things not related to test, like optimizer, scheduler, logger and so on.

Apart from testing, _do_epoch would make us recycle somewhat amount of code, but I think that shared amount is not that large, compared to the amount of works to be done for merging the methods.

For the _data_to_device method, indeed this template had method called to_variable before, which had handled changing the Tensors to the Variables(pytorch 0.3), and moving them into the GPU. Recovering that would not difficult, but I'm not sure that having additional method would be better than current state. That code for moving siamese data into GPU may become slightly easier(one less copy-paste), but in that way one should pack data into tuple, move it to GPU, and unpack the returned tensors.

amjltc295 commented 5 years ago

I think having a test.py is a good idea since this project is the most suitable for competitions, homework and small experiments. Let's leave it there.

I would try to work on the do_epoch thing later. As for the _to_device method, it's good to have.

victoresque / pytorch-template

Add test.py or infer.py #26