Tony-Y / pytorch_warmup

Learning Rate Warmup in PyTorch
https://tony-y.github.io/pytorch_warmup/
MIT License

How to schedule LR with warmup on global_step initially, and then epochs after warmup? #16

Closed: JohnHerry closed this issue 10 months ago

JohnHerry commented 10 months ago

Hi Tony, I have a request: during warmup training in the first epoch, the warmup scheduler should adjust the learning rate every step (or every N steps), and after the warmup stage we would use a regular LR scheduler to adjust the learning rate every epoch. Is there any example code for this?

Tony-Y commented 10 months ago

Do you want something like the LR schedulers with warmup provided by Hugging Face?
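For reference, the Hugging Face schedulers bundle the warmup and the decay into a single object that is stepped once per global step. A minimal sketch of that style, assuming the transformers package is available; the dummy model and the step counts below are placeholders:

import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(10, 1)  # dummy model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
num_training_steps = 10000
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=2000, num_training_steps=num_training_steps)

for step in range(num_training_steps):
    # (forward/backward omitted)
    optimizer.step()
    scheduler.step()  # one scheduler, stepped every global step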

Tony-Y commented 10 months ago

I'm sorry for misunderstanding your request. You have to combine a global-step LR scheduler and an epoch LR scheduler, but it may not be as easy as it sounds. I don't know of any such example code, but I'll let you know if I find some.

Tony-Y commented 10 months ago

I made sample code: https://gist.github.com/Tony-Y/49d6cffa21e60095fdf9b1bec31cdbaa

    for batch_idx, (data, target) in enumerate(progressbar(train_loader)):
        ...
        extra_params["global_step"] += 1
        if extra_params["global_step"] <= extra_params["warmup_period"]:
            # Warmup phase: apply the warmup dampening once per global step.
            with warmup_scheduler.dampening():
                pass
        elif (extra_params["global_step"] - extra_params["warmup_period"]) % len(train_loader) == 0:
            # After warmup: step the epoch LR scheduler once every len(train_loader) steps.
            lr_scheduler.step()
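For context, the loop above assumes the optimizer, the two schedulers, and the extra_params dict have been created beforehand (train_loader and progressbar also come from the gist and are not repeated here). A minimal sketch of that setup, assuming pytorch_warmup is installed; the dummy model, the milestones, and the warmup period are placeholders, not values taken from the gist:

import torch
import pytorch_warmup as warmup

model = torch.nn.Linear(10, 1)  # dummy model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Epoch-based scheduler, stepped once per epoch after warmup ends.
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10, 20], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
extra_params = {"global_step": 0, "warmup_period": 2000}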

Tony-Y commented 10 months ago

I revised the code: https://gist.github.com/Tony-Y/1aa2196ce161d8a4da90cf027ec0f260

New code:

class EpochSchedulerWithWarmup:
  """Combines a per-step warmup scheduler with an epoch-based LR scheduler."""
  def __init__(self, warmup_period, every_n_steps, steps_per_epoch, warmup_scheduler, lr_scheduler):
    self.global_step = 0
    self.warmup_period = warmup_period        # warmup length in global steps
    self.every_n_steps = every_n_steps        # dampening interval during warmup
    self.steps_per_epoch = steps_per_epoch    # number of batches per epoch
    self.warmup_scheduler = warmup_scheduler  # e.g. warmup.UntunedLinearWarmup
    self.lr_scheduler = lr_scheduler          # epoch-based scheduler, e.g. MultiStepLR

  def step(self):
    self.global_step += 1
    if self.global_step <= self.warmup_period and self.global_step % self.every_n_steps == 0:
      with self.warmup_scheduler.dampening():
        pass
    elif (self.global_step - self.warmup_period) % self.steps_per_epoch == 0:
      self.lr_scheduler.step()

Usage:

lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[num_epochs // 3], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
lr_scheduler_with_warmup = EpochSchedulerWithWarmup(
    warmup_period=2000, every_n_steps=1,
    steps_per_epoch=len(dataloader),
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)

for epoch in range(1, num_epochs + 1):
    for batch_idx, batch in enumerate(dataloader):
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        lr_scheduler_with_warmup.step()

Does this code resolve your issue?

Tony-Y commented 10 months ago

EpochSchedulerWithWarmup has a bug: when every_n_steps > 1, a warmup step that is not a multiple of every_n_steps falls through to the elif branch, which can call lr_scheduler.step() during warmup. The every_n_steps check has to be nested inside the warmup branch, so step() should be:

  def step(self):
    self.global_step += 1
    if self.global_step <= self.warmup_period:
      # Warmup phase: dampen the LR only on every `every_n_steps`-th step,
      # and never fall through to the epoch scheduler.
      if self.global_step % self.every_n_steps == 0:
        with self.warmup_scheduler.dampening():
          pass
    elif (self.global_step - self.warmup_period) % self.steps_per_epoch == 0:
      # After warmup: step the epoch LR scheduler once per epoch.
      self.lr_scheduler.step()

Tony-Y commented 10 months ago

@JohnHerry

I checked that every_n_steps works well: https://gist.github.com/Tony-Y/6c79267cab84f3d0a2309f25a9123da4

[Plots attached: MultiStepLRWithWarmup and MultiStepLRWithWarmup-ZoomUp]
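One way to reproduce that kind of check is to drive the combined scheduler with a dummy optimizer and record the learning rate at every step. A minimal sketch (not the gist code), using the EpochSchedulerWithWarmup class and the Usage settings above with small placeholder sizes:

import torch
import pytorch_warmup as warmup

steps_per_epoch, num_epochs = 500, 12
model = torch.nn.Linear(10, 1)  # dummy model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[num_epochs // 3], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
scheduler = EpochSchedulerWithWarmup(
    warmup_period=2000, every_n_steps=1,
    steps_per_epoch=steps_per_epoch,
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)

lrs = []
for _ in range(num_epochs * steps_per_epoch):
    optimizer.step()  # dummy update; no gradients are computed here
    scheduler.step()
    lrs.append(optimizer.param_groups[0]["lr"])
# Plot `lrs` (e.g. with matplotlib) to see the warmup ramp followed by the MultiStepLR drop.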

I think this issue is resolved. Please reopen it if not.

JohnHerry commented 10 months ago

Thank you very much for the kind help. I will give it a try.

JohnHerry commented 9 months ago

Thank you for all the help; it has been very effective! The training process is stable now. Thanks!

Tony-Y commented 9 months ago

@JohnHerry

I'm happy to hear that.