Closed: JohnHerry closed this issue 10 months ago.

Hi Tony, I have a request: during warmup training in the first epoch, the warmup scheduler should adjust the learning rate every step (or every N steps), and after the warmup stage, a regular LR scheduler should adjust the learning rate every epoch. Is there any example code for this?
Do you want something like the LR scheduler with warmup provided by Hugging Face?
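For reference, the Hugging Face option mentioned here is a per-step scheduler from the transformers library, such as get_linear_schedule_with_warmup, which is stepped once per batch rather than once per epoch. A rough, self-contained sketch with a dummy parameter and purely illustrative step counts:

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Dummy parameter and optimizer; the step counts below are illustrative only.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=1e-3)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=1_000, num_training_steps=10_000)

for _ in range(10_000):
    optimizer.step()   # in real training: zero_grad, forward, backward come first
    scheduler.step()   # stepped every training step, not every epoch
```

Since it is stepped per batch throughout training, it does not by itself give the per-epoch behaviour requested after warmup, which is what the rest of this thread works out.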
I'm sorry for misunderstanding your request. You have to combine a per-step warmup scheduler with an epoch-based LR scheduler, which may not be as easy as it sounds. I don't know of example code for this, but I'll let you know if I find any.
I made sample code: https://gist.github.com/Tony-Y/49d6cffa21e60095fdf9b1bec31cdbaa
```python
for batch_idx, (data, target) in enumerate(progressbar(train_loader)):
    ...
    extra_params["global_step"] += 1
    if extra_params["global_step"] <= extra_params["warmup_period"]:
        with warmup_scheduler.dampening():
            pass
    elif (extra_params["global_step"] - extra_params["warmup_period"]) % len(train_loader) == 0:
        lr_scheduler.step()
```
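For reference, the loop above assumes `extra_params`, `warmup_scheduler`, and `lr_scheduler` were created before training starts, and `progressbar` is just a progress-bar wrapper around the data loader. Here is a minimal sketch of such a setup; the model, optimizer, milestones, and the 2000-step warmup period are illustrative choices, not taken from the gist:

```python
import torch
import pytorch_warmup as warmup

model = torch.nn.Linear(16, 1)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Epoch-wise LR decay, stepped manually in the loop above once warmup is over.
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10, 20], gamma=0.1)

# Per-step linear warmup from the pytorch_warmup package.
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=2000)

# Bookkeeping used by the training loop.
extra_params = {"global_step": 0, "warmup_period": 2000}
```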
I revised the code: https://gist.github.com/Tony-Y/1aa2196ce161d8a4da90cf027ec0f260
New code:
```python
class EpochSchedulerWithWarmup:
    def __init__(self, warmup_period, every_n_steps, steps_per_epoch,
                 warmup_scheduler, lr_scheduler):
        self.global_step = 0
        self.warmup_period = warmup_period
        self.every_n_steps = every_n_steps
        self.steps_per_epoch = steps_per_epoch
        self.warmup_scheduler = warmup_scheduler
        self.lr_scheduler = lr_scheduler

    def step(self):
        self.global_step += 1
        if self.global_step <= self.warmup_period and self.global_step % self.every_n_steps == 0:
            with self.warmup_scheduler.dampening():
                pass
        elif (self.global_step - self.warmup_period) % self.steps_per_epoch == 0:
            self.lr_scheduler.step()
```
Usage:
```python
import torch
import pytorch_warmup as warmup

lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[num_epochs // 3], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
lr_scheduler_with_warmup = EpochSchedulerWithWarmup(
    warmup_period=2000, every_n_steps=1,
    steps_per_epoch=len(dataloader),
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)

for epoch in range(1, num_epochs + 1):
    for iter, batch in enumerate(dataloader):
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        lr_scheduler_with_warmup.step()
```
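To watch the resulting schedule without a real dataset, here is a self-contained toy run; the dummy parameter, synthetic loss, and deliberately small sizes are my own choices, not part of the gist. The learning rate should rise step by step during warmup and drop once MultiStepLR reaches its milestone:

```python
import torch
import pytorch_warmup as warmup

# Toy sizes: 5 "epochs" of 50 steps each, with a 100-step warmup.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
num_epochs, steps_per_epoch, warmup_period = 5, 50, 100

lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3], gamma=0.1)
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=warmup_period)
scheduler = EpochSchedulerWithWarmup(
    warmup_period=warmup_period, every_n_steps=1,
    steps_per_epoch=steps_per_epoch,
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)

for epoch in range(1, num_epochs + 1):
    for _ in range(steps_per_epoch):
        optimizer.zero_grad()
        loss = param.pow(2).sum()  # synthetic loss on a dummy parameter
        loss.backward()
        optimizer.step()
        scheduler.step()
    print(f"epoch {epoch}: lr = {optimizer.param_groups[0]['lr']:.4f}")
```

Note that MultiStepLR here counts its own step() calls, i.e. post-warmup epochs, because EpochSchedulerWithWarmup only steps it after the warmup period has ended.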
Does this code resolve your issue?
`EpochSchedulerWithWarmup` has a bug: when `every_n_steps` is greater than 1, a warmup step that is not a multiple of `every_n_steps` falls through to the `elif` branch and can call `lr_scheduler.step()` in the middle of warmup. Its `step()` should be:
```python
def step(self):
    self.global_step += 1
    if self.global_step <= self.warmup_period:
        if self.global_step % self.every_n_steps == 0:
            with self.warmup_scheduler.dampening():
                pass
    elif (self.global_step - self.warmup_period) % self.steps_per_epoch == 0:
        self.lr_scheduler.step()
```
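A quick way to see the fall-through concretely; the numbers are small hypothetical settings of mine, not the linked gist. The loop prints the warmup steps at which the original, un-nested step() would have stepped the LR scheduler:

```python
# Small hypothetical settings chosen only to expose the fall-through.
warmup_period, steps_per_epoch, every_n_steps = 10, 3, 2

for global_step in range(1, warmup_period + 1):
    takes_warmup_branch = global_step % every_n_steps == 0
    if not takes_warmup_branch and (global_step - warmup_period) % steps_per_epoch == 0:
        print(f"un-nested step() would call lr_scheduler.step() at warmup step {global_step}")
# Prints steps 1 and 7; the nested version above never steps the LR scheduler during warmup.
```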
@JohnHerry
I checked that `every_n_steps` works well: https://gist.github.com/Tony-Y/6c79267cab84f3d0a2309f25a9123da4
I think this issue has been resolved. Please reopen it if not.
Thank you very much for the kind help. I will give it a try.
Thank you for all the help, it was very effective! The training process is stable now. Thanks!
@JohnHerry
I'm happy to hear that.