google-research / augmix

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
Apache License 2.0

NaN loss for ResNeXt backbone trained on CIFAR-100 #27

Open devavratTomar opened 9 months ago

devavratTomar commented 9 months ago

Thank you for your work. When I run your code with the ResNeXt backbone on CIFAR-100, the training loss becomes NaN. As in the published paper, I use an initial learning rate of 0.1 for SGD with cosine scheduling.
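For reference, here is a minimal sketch of the setup described above, using only the Python standard library. It assumes the standard cosine annealing formula, which may differ from what the repository's training script does. The names `cosine_lr` and `first_non_finite` are hypothetical helpers, not part of the AugMix code. The second helper shows one way to find the step at which the loss first stops being finite.

```python
import math


def cosine_lr(step, total_steps, lr_max=0.1, lr_min=0.0):
    """Standard cosine annealing: lr_max at step 0, lr_min at total_steps.

    Assumption: this is the textbook schedule, not necessarily the exact
    one used by the AugMix training script.
    """
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def first_non_finite(losses):
    """Return the index of the first NaN or inf loss, or None if all are finite."""
    for i, loss in enumerate(losses):
        if not math.isfinite(loss):
            return i
    return None
```

Logging the per-step loss and passing it to `first_non_finite` would show whether the NaN appears in the first few steps, when the learning rate is still close to 0.1, or later in training.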

SaraGhazanfari commented 9 months ago

Yes, same here. Could you please help with this?

Thanks, Sara