Resume hrnet model - Githubissues

NaifahNurya commented 4 years ago

Thank you for making this work open source.
I am using the code to train on my custom dataset. I ran into some errors but solved them, and training is now in progress. I use hrnet, and training seems very slow on Windows 10 with an NVIDIA GeForce GTX 1080: only about 4 epochs per day.

I use a batch size of 2 and 30 epochs, and I expect the training to finish this Saturday. However, I noticed that the id-loss will probably not be low enough by then to use the model for testing. As @ifzhang suggested in https://github.com/ifzhang/FairMOT/issues/57, hrnet can perform better when trained for at least 60 epochs.

I don't want to start over; I want to continue from epoch 30 to epoch 60. Is there a way to resume training from the epoch-30 model? If so, what should I do?

Thank you

I have attached my training log file. Training is still in progress (currently around epoch 20).

ifzhang commented 4 years ago

You can add --resume to the training command in the sh file, so that training resumes from the latest epoch.
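
For readers unfamiliar with how resuming generally works, here is a minimal, framework-agnostic sketch of the idea. It is not FairMOT's actual implementation: the function names, the `model_last.pkl` file name, and the dummy "training" step are all hypothetical, and `pickle` stands in for what a PyTorch script would normally do with `torch.save` / `torch.load` and `state_dict()`. It also assumes that, to go from 30 to 60 epochs, the total epoch count has to be raised in addition to enabling resume.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, epoch, model_state, optimizer_state):
    """Store everything needed to continue training after `epoch`."""
    with open(path, "wb") as f:
        pickle.dump(
            {"epoch": epoch, "model": model_state, "optimizer": optimizer_state}, f
        )


def load_checkpoint(path):
    """Return (last_epoch, model_state, optimizer_state), or a fresh start."""
    if not os.path.exists(path):
        return 0, None, None
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["epoch"], ckpt["model"], ckpt["optimizer"]


def train(path, num_epochs, resume=False):
    """Train up to `num_epochs` in total, optionally continuing from `path`."""
    if resume:
        start_epoch, model_state, optimizer_state = load_checkpoint(path)
    else:
        start_epoch, model_state, optimizer_state = 0, None, None
    # Hypothetical placeholder states; a real script would hold network
    # weights and optimizer buffers here.
    model_state = model_state or {"weight": 0.0}
    optimizer_state = optimizer_state or {"lr": 1e-4}

    epochs_run = []
    for epoch in range(start_epoch + 1, num_epochs + 1):
        model_state["weight"] += 1.0  # stand-in for one epoch of training
        save_checkpoint(path, epoch, model_state, optimizer_state)
        epochs_run.append(epoch)
    return epochs_run


ckpt_path = os.path.join(tempfile.mkdtemp(), "model_last.pkl")
first = train(ckpt_path, num_epochs=30)                # epochs 1..30
second = train(ckpt_path, num_epochs=60, resume=True)  # continues at epoch 31
print(first[0], first[-1], second[0], second[-1])
```

The key point is that the checkpoint records the last finished epoch together with the model and optimizer state, so the second run skips the first 30 epochs instead of repeating them.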

NaifahNurya commented 4 years ago

Thank you @ifzhang

ifzhang / FairMOT

Resume hrnet model #147