pkuCactus / BDCN

The code for the CVPR2019 paper Bi-Directional Cascade Network for Perceptual Edge Detection
MIT License

Training and Resuming #23

huberthomas closed 5 years ago

huberthomas commented 5 years ago

Can someone explain to me how to train and resume from a pretrained model? For example, I used the available pretrained bsds500 model, then added my new data dir in cfg.py under the bsds500 key and started the training with

python train.py -p models/bdcn_pretrained_on_bsds500.pth -c --complete_pretrain models/bdcn_algo_on_bsds500.pth

I stopped the run, and now I want to continue training with the same parameter settings, but from the latest point stored in params. Can someone tell me how to do that? I saw that there is a --resume parameter and tried different combinations of arguments, but they all ended in an error (KeyError: 'step').

Big thanks in advance!

pkuCactus commented 5 years ago

To resume, you need to use the other checkpoint, the one ending in .tar, which stores the optimizer state.
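
A likely reading of the KeyError: 'step' is that the .pth file holds only model weights, while the resume path looks up a 'step' entry that exists only in the .tar file. The sketch below illustrates that difference with plain dictionaries. Only the 'step' key comes from the error message in this thread; the other key names, the sample values, and the resume_step helper are hypothetical and are not taken from train.py.

```python
# Stand-ins for what torch.load(path) might return for each file type.
# Only the 'step' key is taken from the error message; every other key
# name and value here is an assumption made for illustration.
weights_only = {"conv1.weight": [0.1, 0.2]}  # like a .pth with weights only
full_state = {
    "step": 26000,                 # training iteration at save time
    "param": weights_only,         # hypothetical key for model weights
    "solver": {"lr": 1e-6},        # hypothetical key for optimizer state
}


def resume_step(checkpoint):
    """Return the saved step, or fail with a clearer message than KeyError."""
    if "step" not in checkpoint:
        raise ValueError(
            "checkpoint has no 'step' entry; pass the .tar file to --resume"
        )
    return checkpoint["step"]


start = resume_step(full_state)    # works: the full checkpoint carries 'step'

try:
    resume_step(weights_only)      # fails: weights-only files lack 'step'
    failed = False
except ValueError:
    failed = True
```

Checking for the key before indexing turns the bare KeyError into a message that points at the wrong file type.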

huberthomas commented 5 years ago

Here is my solution, e.g.:

python train.py -c --complete_pretrain params/bdcn_26000.pth --resume params/bdcn_26000.pth.tar

but I had to additionally comment out the following lines in train.py:

if args.resume:
...
# for x in state.__dict__:
#    logger.info(x)
...
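
One possible reason those lines fail is that the loaded checkpoint is a plain dict, which has no __dict__ attribute. That is an assumption, based on the earlier KeyError, which suggests the checkpoint is indexed by key. Under that assumption, a minimal sketch of logging the stored keys instead of commenting the loop out could look like this; the state contents below are placeholders, not real checkpoint data.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("resume_sketch")

# Placeholder for the loaded checkpoint; assumed to be a plain dict.
state = {"step": 26000, "param": {}, "solver": {}}

# Plain dict instances do not expose __dict__, so state.__dict__ would
# raise AttributeError.
has_dict_attr = hasattr(state, "__dict__")

# Iterating over the dict itself yields its keys in insertion order.
keys = list(state)
for key in keys:
    logger.info(key)
```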