NaN loss for ResNeXt backbone trained on CIFAR-100

google-research / augmix

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

Apache License 2.0


NaN loss for ResNeXt backbone trained on CIFAR-100 #27

Open devavratTomar opened 9 months ago

devavratTomar commented 9 months ago

Thank you for your work. When I run your code with the ResNeXt backbone on CIFAR-100, the training loss becomes NaN. As described in the published paper, I use SGD with an initial learning rate of 0.1 and cosine learning-rate scheduling.
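For reference, here is a minimal sketch of the setup described above: a cosine-annealed learning rate starting at 0.1, plus a small check that reports the first step at which the loss stops being finite. The names `cosine_lr` and `first_non_finite`, and the `lr_min=0.0` floor, are illustrative assumptions and are not taken from the repository's training script.

```python
import math


def cosine_lr(step, total_steps, lr_max=0.1, lr_min=0.0):
    """Cosine-annealed learning rate, decaying from lr_max to lr_min.

    lr_max=0.1 matches the initial learning rate quoted above;
    lr_min=0.0 is an assumption made for this sketch.
    """
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for i, loss in enumerate(losses):
        if not math.isfinite(loss):
            return i
    return None
```

Logging the learning rate next to the index returned by `first_non_finite` may help show whether the NaN appears in the first steps, while the rate is still near 0.1, or later in training.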

SaraGhazanfari commented 9 months ago

Yes, same here. Could you please help with this?

Thanks, Sara