Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

LearningRateMonitor doesn't log to MLFlow via mlflow.pytorch.autolog() #7911

Open notonlyvandalzzz opened 3 years ago

notonlyvandalzzz commented 3 years ago

🐛 Bug

PyTorch Lightning doesn't send learning-rate data to MLflow when mlflow.pytorch.autolog() is enabled and the LearningRateMonitor callback is activated.

To Reproduce

```python
import torch
import pytorch_lightning as pl

class EmbLt(pl.LightningModule):
    def __init__(self, ...):
        ...

    def configure_optimizers(self):
        # Optimize only the parameters that require gradients
        optimizer = torch.optim.Adam(
            filter(lambda p: p.requires_grad, self.parameters()),
            lr=self.learning_rate,
        )
        lr_scheduler = {
            'scheduler': torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, EPS // 3, 1),
            'name': 'lr_sched',
        }
        return [optimizer], [lr_scheduler]
```

```python
lr_monitor = LearningRateMonitor(logging_interval='epoch')
emb = EmbLt(ntoken=len(vect_obj), input_size=emsize, hidden_size=nhid, learning_rate=lr, drop=drop_rate)
trainer = pl.Trainer(gpus=1, max_epochs=EPS, progress_bar_refresh_rate=20, callbacks=[lr_monitor])

with mlflow.start_run() as run:
    trainer.fit(emb, dataloaders['tr'], dataloaders['ts'])
```

Expected behavior

A dedicated learning-rate metric logged inside MLflow's current run.

Environment

Additional context

Was able to work around it with this trick:

```python
class EmbLt(pl.LightningModule):
    ...

    def on_epoch_start(self):
        # Manually log the current learning rate at the start of each epoch
        eopt = self.optimizers()
        self.log('lr_curr', eopt.param_groups[0]['lr'])
```
justusschock commented 3 years ago

Hi @notonlyvandalzzz ,

our callbacks (like the LearningRateMonitor) only use our internal logging API. So when you create an MLFlowLogger and pass it to the Trainer, everything should work as expected.

However, this is the only logging API we offer, since there are various APIs from different logging services and they all integrate differently.

Best, Justus

notonlyvandalzzz commented 3 years ago

Oh, that makes sense. Could you please update the docs on the boundary between internal and external logging (and how to switch between them manually), so that part of the documentation is clear?

justusschock commented 3 years ago

Would you mind sending a PR? That way we can be sure the docs have been clarified in a way that actually helps understanding here.