tumurzakov / AnimateDiff

AnimateDiff with training

Training problem, scheduler not stepping #11

Closed: ezra-ch closed this issue 1 year ago

ezra-ch commented 1 year ago

Output:

Steps:  20%|████████████▍                                                 | 2/10 [00:05<00:20,  2.55s/it, lr=3e-5, step_loss=0.0871]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  30%|██████████████████▉                                            | 3/10 [00:07<00:15,  2.28s/it, lr=3e-5, step_loss=0.185]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  40%|████████████████████████▊                                     | 4/10 [00:09<00:12,  2.15s/it, lr=3e-5, step_loss=0.0842]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  50%|███████████████████████████████▌                               | 5/10 [00:11<00:10,  2.08s/it, lr=3e-5, step_loss=0.394]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  60%|█████████████████████████████████████▏                        | 6/10 [00:13<00:08,  2.04s/it, lr=3e-5, step_loss=0.0314]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  70%|██████████████████████████████████████████▋                  | 7/10 [00:15<00:06,  2.02s/it, lr=3e-5, step_loss=0.00743]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  80%|██████████████████████████████████████████████████▍            | 8/10 [00:17<00:03,  2.00s/it, lr=3e-5, step_loss=0.194]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}
Steps:  90%|███████████████████████████████████████████████████████▊      | 9/10 [00:19<00:01,  1.99s/it, lr=3e-5, step_loss=0.0135]{'base_lrs': [3e-05], 'last_epoch': 0, 'verbose': False, '_step_count': 1, '_get_lr_called_within_step': False, '_last_lr': [3e-05], 'lr_lambdas': [None]}

Is it supposed not to step? I was trying to use lr_scheduler: cosine, but the learning rate stays constant. In the output above, _step_count stays at 1 and last_epoch stays at 0 on every step, so the scheduler never advances past its initial state. The relevant part of the training loop (a standalone check is sketched after the snippet):

                # End of the inner training loop: backprop, clip gradients,
                # then step the optimizer and the LR scheduler.
                accelerator.backward(loss)
                if accelerator.sync_gradients:
                    accelerator.clip_grad_norm_(unet.parameters(), max_grad_norm)
                optimizer.step()
                lr_scheduler.step()  # should advance the schedule on every call
                optimizer.zero_grad()
                print(lr_scheduler.state_dict())  # but _step_count stays at 1 (see log)
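
For comparison, here is a minimal standalone check, assuming the scheduler is built with diffusers' get_scheduler helper as in diffusers-style training scripts. With a working cosine schedule, _step_count should increment and _last_lr should start decaying on every call to step():

    # Minimal repro of a cosine schedule that actually steps.
    # Assumes only that torch and diffusers are installed.
    import torch
    from diffusers.optimization import get_scheduler

    params = [torch.nn.Parameter(torch.zeros(1))]
    optimizer = torch.optim.AdamW(params, lr=3e-5)
    lr_scheduler = get_scheduler(
        "cosine",
        optimizer=optimizer,
        num_warmup_steps=0,
        num_training_steps=10,
    )

    for _ in range(3):
        optimizer.step()     # optimizer first, to avoid the PyTorch ordering warning
        lr_scheduler.step()  # _step_count should read 2, 3, 4 in the printed dicts
        print(lr_scheduler.state_dict())

If that works but the training script still prints a frozen state, one thing worth checking is how the scheduler interacts with gradient accumulation: diffusers' example scripts create it with num_warmup_steps and num_training_steps multiplied by gradient_accumulation_steps, because step() is called once per micro-batch rather than once per synced update.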