Lightning-AI / pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0
27.47k stars 3.3k forks

Resume training, how to change learning scheduler? #19865

Open jzhanghzau opened 1 month ago

jzhanghzau commented 1 month ago

Bug description

I used a cosine learning rate scheduler in my first round of training and stopped once training reached the configured max steps. Now I need to restart training and replace the learning rate scheduler with a new one, because otherwise the learning rate would stay at 0. My question is: how do I restart training with a new learning rate scheduler?

Can I just define a new learning rate scheduler in `configure_optimizers()` and then pass the last checkpoint to `trainer.fit()`?
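To make the concern concrete, here is a minimal sketch in plain PyTorch (no Lightning) of what can happen when a saved scheduler state is loaded into a newly configured scheduler. It assumes that resuming with a checkpoint restores the scheduler state in the same way as `load_state_dict`; the model, learning rate, and `T_max` values are made up for illustration.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in model, purely illustrative

# First run: cosine schedule that decays to 0 over 100 steps.
old_opt = torch.optim.SGD(model.parameters(), lr=0.1)
old_sched = torch.optim.lr_scheduler.CosineAnnealingLR(old_opt, T_max=100)
for _ in range(100):
    old_opt.step()    # no gradients here, so the parameters are unchanged
    old_sched.step()
saved_sched_state = old_sched.state_dict()

# Second run: a newly configured scheduler with a longer horizon.
new_opt = torch.optim.SGD(model.parameters(), lr=0.1)
new_sched = torch.optim.lr_scheduler.CosineAnnealingLR(new_opt, T_max=500)
fresh_lr = new_sched.get_last_lr()[0]  # LR of the untouched new scheduler

# Loading the old state overwrites the new scheduler's settings and counters.
new_sched.load_state_dict(saved_sched_state)
restored_t_max = new_sched.T_max
restored_last_epoch = new_sched.last_epoch
restored_lr = new_sched.get_last_lr()[0]

print(fresh_lr, restored_t_max, restored_last_epoch, restored_lr)
```

If this assumption holds, restoring the full training state would carry the finished schedule over into the new scheduler, which is why loading only the weights looks like the safer route.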

Thanks so much!

JJ

What version are you seeing the problem on?

master

How to reproduce the bug

No response

Error messages and logs

# Error messages and logs here please

Environment

Current environment

```
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):
```

More info

No response

jzhanghzau commented 1 month ago

I only want to load the model weights; everything else should be initialized from scratch. How can I do this?

By the way, how can I combine this with `LightningCLI`?
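A rough sketch of what "weights only" could look like, assuming the checkpoint is a dictionary that stores the model weights under the `state_dict` key (as Lightning checkpoints do). The `load_weights_only` helper and the `nn.Linear` stand-in are hypothetical names for illustration; the checkpoint written here is a simplified imitation, not a real training checkpoint.

```python
import os
import tempfile

import torch
from torch import nn


def load_weights_only(model: nn.Module, ckpt_path: str) -> nn.Module:
    """Hypothetical helper: copy only the weights from a checkpoint into `model`."""
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # Optimizer state, scheduler state, and step counters are deliberately ignored.
    model.load_state_dict(checkpoint["state_dict"])
    return model


# Simplified imitation of a checkpoint from the first training run.
trained = nn.Linear(4, 2)
ckpt_path = os.path.join(tempfile.mkdtemp(), "last.ckpt")
torch.save(
    {
        "state_dict": trained.state_dict(),
        "global_step": 100,
        "optimizer_states": [],
        "lr_schedulers": [],
    },
    ckpt_path,
)

# A freshly initialized model receives the trained weights and nothing else.
fresh = load_weights_only(nn.Linear(4, 2), ckpt_path)
weights_match = all(
    torch.equal(a, b)
    for a, b in zip(trained.state_dict().values(), fresh.state_dict().values())
)
print(weights_match)
```

Since a `LightningModule` is an `nn.Module`, the same calls should apply to it; calling `trainer.fit(model)` afterwards without `ckpt_path` would then start with a fresh optimizer, scheduler, and step count. For `LightningCLI`, one possible approach (an assumption, not a confirmed recipe) is to give the module a constructor argument such as a hypothetical `init_ckpt_path` and load the weights inside `__init__`, since the CLI exposes constructor arguments as `--model.<name>` options.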

Thanks in advance!

Best,

JJ