princeton-vl / pose-ae-train

Training code for "Associative Embedding: End-to-End Learning for Joint Detection and Grouping"
BSD 3-Clause "New" or "Revised" License

training not saving model or logging when continuing training #27

Closed (crockwell closed 5 years ago)

crockwell commented 6 years ago

When continuing training, the experiment stops saving checkpoints and logging. This is because nothing handles the experiment name when it is passed after '-c' instead of '-e'.

In other words, 'python train.py -e test_run_001' saves and logs, but 'python train.py -c test_run_001' does not.

To handle saving, I suggest adding something like the following after line 64 of train.py, in the 'save' function:

    if config['opt'].exp == 'pose' and config['opt'].continue_exp is not None:
        resume = os.path.join('exp', config['opt'].continue_exp)

To handle logging, I suggest a variant of the same code after line 88 of task/pose.py, in the make_network function:

    if configs['opt'].exp == 'pose' and configs['opt'].continue_exp is not None:
        exp_path = os.path.join('exp', configs['opt'].continue_exp)
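
Both checks follow the same pattern, so they could share one helper. Below is a minimal standalone sketch of that idea, not the repository's actual code. The flag names ('-e'/'--exp' with a default of 'pose', '-c'/'--continue_exp') are inferred from the snippets above, and resolve_exp_dir is a hypothetical name I made up for illustration.

```python
import argparse
import os


def parse_command_line(argv=None):
    # Assumption: flag names and the 'pose' default are inferred from the
    # issue text, not copied from train.py.
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--continue_exp', type=str, default=None)
    parser.add_argument('-e', '--exp', type=str, default='pose')
    return parser.parse_args(argv)


def resolve_exp_dir(opt, root='exp'):
    """Hypothetical helper: choose the directory for checkpoints and logs."""
    # If -e was left at its default and -c was given, reuse the directory
    # of the experiment being continued.
    if opt.exp == 'pose' and opt.continue_exp is not None:
        return os.path.join(root, opt.continue_exp)
    # Otherwise an explicit (or default) -e name decides the directory.
    return os.path.join(root, opt.exp)


if __name__ == '__main__':
    print(resolve_exp_dir(parse_command_line()))
```

With this sketch, passing only '-c test_run_001' resolves to the test_run_001 directory under 'exp', while an explicit '-e' name still takes precedence when both flags are given.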

notagenius commented 5 years ago

As a workaround, pass the name again with both flags: 'python train.py -c test_run_001 -e test_run_001'