DiffEqML / torchdyn

A PyTorch library entirely dedicated to neural differential equations, implicit models and related numerical methods
https://torchdyn.org
Apache License 2.0

MisconfigurationException in 2nd Tutorial (Classification) #43

Closed mrpositron closed 3 years ago

mrpositron commented 3 years ago

I am going over the 2nd torchdyn tutorial (classification) and encountered this error:

[screenshot: MisconfigurationException traceback]

The error happens when I run this command:

[screenshot: the command that triggers the error]

I think this error is caused by PyTorch Lightning. However, I am a newbie to PyTorch Lightning, so I don't know how to debug it.

Zymrael commented 3 years ago

Hi @MrPositron, thank you for pointing this out. Could you let me know which version of torch you're currently using?

mrpositron commented 3 years ago

The PyTorch version is 1.6; I ran it on Colab. Furthermore, changing

sched = {'scheduler': torch.optim.lr_scheduler.ReduceLROnPlateau(opt),
         'monitor': 'loss',
         'interval': 'step',
         'frequency': 10}

to

sched = {'scheduler': torch.optim.lr_scheduler.ReduceLROnPlateau(opt),
         'monitor': 'train_loss',
         'interval': 'step',
         'frequency': 10}

does not result in an error.
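For context, this sched dict is returned from the tutorial's Learner in configure_optimizers. A minimal sketch of that wiring, assuming pytorch-lightning==0.9.0 and a tutorial-style Learner (the class body below is an illustration, not the tutorial's exact code):

    # Sketch (assumption: pytorch-lightning==0.9.0, tutorial-style Learner).
    # In 0.9.x, configure_optimizers may return ([optimizers], [scheduler dicts]);
    # the 'monitor' key must name a metric the module makes available each step.
    import torch
    import torch.nn as nn
    import pytorch_lightning as pl

    class Learner(pl.LightningModule):
        def __init__(self, model: nn.Module):
            super().__init__()
            self.model = model

        def configure_optimizers(self):
            opt = torch.optim.Adam(self.model.parameters(), lr=1e-3)
            sched = {'scheduler': torch.optim.lr_scheduler.ReduceLROnPlateau(opt),
                     'monitor': 'train_loss',  # must match a metric visible to Lightning
                     'interval': 'step',
                     'frequency': 10}
            return [opt], [sched]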

Zymrael commented 3 years ago

Make sure the metric you choose to monitor is available to the LightningModule after each step. On my end, the training loop works with both loss and train_loss (pytorch-lightning==0.9.0).

The following training_step method:

    def training_step(self, batch, batch_idx):
        self.iters += 1.
        x, y = batch
        x, y = x.to(device), y.to(device)  # device is defined earlier in the notebook
        y_hat = self.model(x)
        loss = nn.CrossEntropyLoss()(y_hat, y)
        epoch_progress = self.iters / self.loader_len
        acc = accuracy(y_hat, y)
        # read and reset the NFE (number of function evaluations) counter of the NeuralDE layer
        nfe = self.model[1].nfe
        self.model[1].nfe = 0
        tqdm_dict = {'train_loss': loss, 'accuracy': acc, 'NFE': nfe}
        logs = {'train_loss': loss, 'epoch': epoch_progress}
        return {'loss': loss, 'progress_bar': tqdm_dict, 'log': logs}

should make loss available. Could you confirm that the MisconfigurationException is raised even with the above?

Zymrael commented 3 years ago

The new version of pytorch-lightning has fixed the issue, closing.
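For readers on pytorch-lightning >= 1.0: the 'log' dict return was replaced by self.log, and the scheduler's 'monitor' key now resolves against metrics registered that way. A minimal sketch of the newer pattern (illustrative, not the tutorial's code):

    # Sketch for pytorch-lightning >= 1.0 (names are illustrative):
    # the monitored metric must be registered with self.log.
    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.model(x)
        loss = nn.CrossEntropyLoss()(y_hat, y)
        self.log('train_loss', loss, on_step=True, prog_bar=True)
        return loss

    def configure_optimizers(self):
        opt = torch.optim.Adam(self.model.parameters(), lr=1e-3)
        sched = {'scheduler': torch.optim.lr_scheduler.ReduceLROnPlateau(opt),
                 'monitor': 'train_loss',
                 'interval': 'step',
                 'frequency': 10}
        return [opt], [sched]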