mahyarnajibi / fast-rcnn-torch

Fast R-CNN Torch Implementation
MIT License

Problems in SequentialTrainer.lua #6

Closed ruotianluo closed 8 years ago

ruotianluo commented 8 years ago

Line 45: `_db_name` should be `self._db_name`. Line 136: `_optimState` should be `self._optimState`.
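For readers unfamiliar with the pitfall being reported: in Lua, a bare name inside a method resolves to a global (usually `nil`), not to the object's field, so fields must be read through `self`. A minimal sketch with a hypothetical class (not the actual SequentialTrainer code):

```lua
-- Hypothetical class illustrating the scoping bug reported above.
local Trainer = {}
Trainer.__index = Trainer

function Trainer.new(db_name)
  local t = setmetatable({}, Trainer)
  t._db_name = db_name
  return t
end

function Trainer:dbName()
  -- BUG: a bare `_db_name` here would be an undefined global and yield nil.
  -- return _db_name
  -- FIX: qualify the field access with `self`:
  return self._db_name
end

print(Trainer.new('voc_2007'):dbName())  -- prints voc_2007
```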

mahyarnajibi commented 8 years ago

Thank you for bringing this to my attention. I modified the resume-training option. The issues are addressed in commit 92a2b2a.

ruotianluo commented 8 years ago

I've pulled your new commits, and there is no longer an error when resuming training. However, the loss suggests the model is still being trained from scratch. I'm not sure whether the problem is on my end; could you add more detail on resume training to the README? Thank you.

mahyarnajibi commented 8 years ago

Sure, I will. If you trained a model with this code, you should be able to resume from that model. Each time the code saves the network, three files are written to disk: one containing the model weights, one containing the optimizer state (which holds the gradient information needed to apply momentum), and a text file recording the configuration used during training.
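The three-file checkpoint described above could be sketched roughly as follows in Torch7 (function and file names here are illustrative, not the repo's actual code):

```lua
require 'torch'

-- Hedged sketch: save the model weights, the optimizer state, and a
-- plain-text record of the training configuration as three separate files.
local function save_checkpoint(dir, iter, model, optimState, config_text)
  -- clearState() drops intermediate buffers before serializing the weights
  torch.save(dir .. '/model_iter' .. iter .. '.t7', model:clearState())
  -- the optimizer state carries momentum/gradient history for resuming
  torch.save(dir .. '/optimState_iter' .. iter .. '.t7', optimState)
  -- human-readable record of the configuration used during training
  local f = io.open(dir .. '/config_iter' .. iter .. '.txt', 'w')
  f:write(config_text)
  f:close()
end
```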

To resume training, set the -resume_training flag to true in the config file and set -pre_trained_file to the model you want to continue training from (not the ImageNet pre-trained model). If everything goes well, you should see these lines in your terminal: "Copying weights from classifier layer!", "Copying weights from regressor layer!", and "Preparing the regression layer weights...", and the iteration count should start from where you stopped training your saved model. If instead you see "The optimizer state is not found for continuing the training, a new state is being used!", it means the optimizer state could not be accessed. Can you check your terminal output to see whether these lines are printed?
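Assuming the config file uses Torch's standard `torch.CmdLine` options, the two settings mentioned above might look something like this (the flag names come from the discussion; the checkpoint path is a made-up example):

```lua
-- Hypothetical excerpt of the training config for resuming.
-- '-resume_training' and '-pre_trained_file' are the flags discussed above;
-- the default path below is purely illustrative.
cmd:option('-resume_training', true, 'resume from a previously saved model')
cmd:option('-pre_trained_file', 'models/frcnn_iter_40000.t7',
           'saved checkpoint to continue from (not the ImageNet model)')
```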

ruotianluo commented 8 years ago

It's correct now, thank you very much.