Open pervaizniazi opened 5 years ago
The issue begins in the lr_step field in the config file.
lr: 0.0005
lr_step: '4.83'
warmup: true
warmup_lr: 0.00005
# typically we will use 4000 warmup step for single GPU on VOC
warmup_step: 1000
In the call to get the learning rate scheduler:
# decide learning rate
base_lr = lr
lr_factor = config.TRAIN.lr_factor
lr_epoch = [float(epoch) for epoch in lr_step.split(',')]
lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch]
lr = base_lr * (lr_factor ** (len(lr_epoch) - len(lr_epoch_diff)))
lr_iters = [int(epoch * len(roidb) / batch_size) for epoch in lr_epoch_diff]
print('lr', lr, 'lr_epoch_diff', lr_epoch_diff, 'lr_iters', lr_iters)
lr_scheduler = WarmupMultiFactorScheduler(lr_iters, lr_factor, config.TRAIN.warmup,
config.TRAIN.warmup_lr, config.TRAIN.warmup_step)
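Note that step in your failing assertion is lr_iters. If you follow the logic here, you will see that lr_epoch = [4.83], which means that lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch] is an empty list: the if is never satisfied when begin_epoch is greater than every value in lr_step (here begin_epoch is 76 and the only step is 4.83). lr_iters is built from lr_epoch_diff, so it is empty as well, and the check len(step) >= 1 in the scheduler fails.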
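To make this concrete, here is a minimal standalone sketch of the same arithmetic. The function name compute_schedule, the lr_factor of 0.1, and the dataset size and batch size are assumptions for illustration only; they are not taken from the repository or from your config.

```python
def compute_schedule(lr_step, begin_epoch, base_lr, lr_factor, num_images, batch_size):
    """Mirror the schedule logic quoted above (hypothetical helper, not repo code)."""
    lr_epoch = [float(epoch) for epoch in lr_step.split(',')]
    # Only decay epochs that are still ahead of begin_epoch survive this filter.
    lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch]
    # Each step that has already passed scales the learning rate by lr_factor once.
    lr = base_lr * (lr_factor ** (len(lr_epoch) - len(lr_epoch_diff)))
    lr_iters = [int(epoch * num_images / batch_size) for epoch in lr_epoch_diff]
    return lr, lr_iters


# Assumed values: lr_factor=0.1, 1000 images, batch size 1.
fresh_lr, fresh_iters = compute_schedule('4.83', 0, 0.0005, 0.1, 1000, 1)
resumed_lr, resumed_iters = compute_schedule('4.83', 76, 0.0005, 0.1, 1000, 1)

print('fresh run:  ', fresh_lr, fresh_iters)      # one decay step remains
print('resumed run:', resumed_lr, resumed_iters)  # no steps remain, list is empty

# This is the condition the scheduler asserts on; it is False for the resumed run.
print('passes assert:', isinstance(resumed_iters, list) and len(resumed_iters) >= 1)
```

With begin_epoch at 0 the list has one entry and the scheduler is happy; with begin_epoch at 76 the list is empty, which is exactly the state that triggers the AssertionError.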
I don't have a fix for this. I'd be happy to see more action here; it's a pretty serious flaw.
Hello, I need to resume training, but I am getting the following exception:
File "experiments/fpn/../../fpn/../lib/utils/lr_scheduler.py", line 29, in __init__
assert isinstance(step, list) and len(step) >= 1
AssertionError
I have made the following changes in the .yaml file:
begin_epoch: 76
end_epoch: 100
Any help will be much appreciated.
Thanks