ezra-ch · closed 1 year ago
Output:

```
Steps: 20% | 2/10 [00:05<00:20, 2.55s/it, lr=3e-5, step_loss=0.0871]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 30% | 3/10 [00:07<00:15, 2.28s/it, lr=3e-5, step_loss=0.185]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 40% | 4/10 [00:09<00:12, 2.15s/it, lr=3e-5, step_loss=0.0842]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 50% | 5/10 [00:11<00:10, 2.08s/it, lr=3e-5, step_loss=0.394]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 60% | 6/10 [00:13<00:08, 2.04s/it, lr=3e-5, step_loss=0.0314]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 70% | 7/10 [00:15<00:06, 2.02s/it, lr=3e-5, step_loss=0.00743]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 80% | 8/10 [00:17<00:03, 2.00s/it, lr=3e-5, step_loss=0.194]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps: 90% | 9/10 [00:19<00:01, 1.99s/it, lr=3e-5, step_loss=0.0135]
{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
```
Is the scheduler supposed to not step? I was trying to use `lr_scheduler: cosine`, but the learning rate stays constant: `_step_count` stays at 1 and `_last_lr` stays at `[3e-05]` on every step.
This is the relevant part of the training loop:

```python
accelerator.backward(loss)
if accelerator.sync_gradients:
    accelerator.clip_grad_norm_(unet.parameters(), max_grad_norm)
optimizer.step()
lr_scheduler.step()
optimizer.zero_grad()
print(lr_scheduler.state_dict())  # added to debug the scheduler state
```
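For comparison, here is a minimal, self-contained sanity check (plain PyTorch, no accelerate; the model and numbers are illustrative, not taken from the script above). With a working cosine scheduler, `_step_count` increments and `get_last_lr()` changes on every `scheduler.step()`, whereas the log above shows `_step_count` stuck at 1:

```python
import torch

# Standalone sanity check: a plain cosine scheduler should advance its
# _step_count and change the reported LR on every scheduler.step() call.
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

for step in range(10):
    loss = model(torch.randn(2, 4)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()  # _step_count should increment here
    optimizer.zero_grad()
    print(step, scheduler.state_dict()["_step_count"], scheduler.get_last_lr())
```

If this prints an advancing `_step_count`, the cosine schedule itself is fine, and the problem is in how the training script creates or steps its scheduler.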
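One possibility worth checking (an assumption on my part, not a confirmed diagnosis): in diffusers-style scripts the scheduler is usually created with `get_scheduler` and passed through `accelerator.prepare()`. Accelerate then wraps it in an `AcceleratedScheduler` that skips `scheduler.step()` whenever the optimizer step itself was skipped, e.g. on fp16 gradient-overflow steps, which would leave `_step_count` stuck exactly as in the log. A hedged sketch of that wiring, with illustrative values:

```python
import torch
from accelerate import Accelerator
from diffusers.optimization import get_scheduler

accelerator = Accelerator()  # illustrative; the real run may use mixed_precision="fp16"
model = torch.nn.Linear(4, 4)  # stand-in for the unet
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
lr_scheduler = get_scheduler(
    "cosine",
    optimizer=optimizer,
    num_warmup_steps=0,     # illustrative; during warmup the LR would ramp, not decay
    num_training_steps=10,  # should match the real total step count
)
# After prepare(), lr_scheduler is an AcceleratedScheduler: it only advances
# the wrapped scheduler on steps where the optimizer actually stepped.
model, optimizer, lr_scheduler = accelerator.prepare(model, optimizer, lr_scheduler)
```

Printing `type(lr_scheduler)` and `optimizer.step_was_skipped` right after `optimizer.step()` in the real loop would show whether this is what is happening.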