NicoZenith opened this issue 1 month ago
Hi @NicoZenith, currently there is only a learning rate decay scheduler implemented, which you can configure through train_config.gamma. What options are you looking for?
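For reference, the current behavior is roughly the following (a sketch, not the exact llama_recipes code; `optimizer` and `train_config` are assumed to be in scope, and `train_one_epoch` is a made-up helper): gamma is passed to a StepLR that steps once per epoch.

```python
from torch.optim.lr_scheduler import StepLR

# Sketch of the existing behavior: the LR is multiplied by gamma once per epoch.
scheduler = StepLR(optimizer, step_size=1, gamma=train_config.gamma)

for epoch in range(train_config.num_epochs):
    train_one_epoch(...)  # hypothetical helper standing in for the training loop
    scheduler.step()      # decay only happens at epoch boundaries
```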
@mreso the gamma factor decays the learning rate after each epoch; I'm looking for a scheduler that decays it over iterations.
@NicoZenith I see what you mean. I think it would be a great idea to provide more flexibility with the learning rate schedule, and also to allow for warmup steps, which we currently don't support IIRC.
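For context, a per-iteration schedule with warmup can be expressed with a stock PyTorch `LambdaLR`; a minimal sketch (the `warmup_steps`/`total_steps` names and the cosine shape are just illustrative, not an existing llama_recipes API):

```python
import math
from torch.optim.lr_scheduler import LambdaLR

def get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps):
    # Linear warmup for warmup_steps, then cosine decay to zero over the rest.
    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))
    return LambdaLR(optimizer, lr_lambda)
```

With a schedule like this, `scheduler.step()` would be called after every optimizer step rather than at the end of each epoch.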
My first thought was to implement it similarly to custom_dataset, with the option to point to a file containing a function that creates an LRScheduler like StepLR, plus a config flag to choose whether we call step after each epoch or after each iteration:
    from dataclasses import dataclass

    @dataclass
    class LRScheduler:
        scheduler: str = "llama_recipes.utils.lr_schedulers.get_step_lr"
        # This is not good for customization.....
        step_on_epoch_end: bool = True
        step_on_iteration_end: bool = False

        def __post_init__(self):
            assert self.step_on_epoch_end != self.step_on_iteration_end, (
                "Choose to either step after the epoch or after the iteration ends, not both"
            )
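As a rough sketch of how the two flags could be consumed in the training loop (the `get_step_lr` factory body and the loop variable names here are placeholders, not existing code):

```python
from torch.optim.lr_scheduler import StepLR

# Hypothetical factory that the `scheduler` config string would resolve to.
def get_step_lr(optimizer, train_config):
    return StepLR(optimizer, step_size=1, gamma=train_config.gamma)

# Hypothetical training loop honoring the two step flags.
for epoch in range(num_epochs):
    for batch in train_dataloader:
        loss = train_step(batch)  # placeholder for forward/backward
        optimizer.step()
        if lr_config.step_on_iteration_end:
            lr_scheduler.step()
    if lr_config.step_on_epoch_end:
        lr_scheduler.step()
```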
But then we don't have a great way to route parameters to the scheduler (like the gamma for the StepLR we have now) or to add custom parameters to a custom factory. I'll try to give it some thought in the next days to come up with a design pattern that we can also reuse in other areas.
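One possible pattern for the parameter routing would be a free-form kwargs dict on the config that gets forwarded to whatever factory the `scheduler` string resolves to; a sketch under that assumption (field names are hypothetical):

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class LRSchedulerConfig:
    scheduler: str = "llama_recipes.utils.lr_schedulers.get_step_lr"
    # Extra arguments forwarded verbatim to the factory, e.g. {"gamma": 0.85}.
    scheduler_kwargs: Dict[str, Any] = field(default_factory=dict)
    step_on_epoch_end: bool = True
    step_on_iteration_end: bool = False

# The loaded factory would then receive the extra kwargs:
#   lr_scheduler = factory(optimizer, **config.scheduler_kwargs)
```

This keeps custom factories self-describing, at the cost of losing per-field validation in the dataclass.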
🚀 The feature, motivation and pitch
I don't see any option to set up a learning rate scheduler in the fine-tuning input arguments. Is there a way to implement it?