lessw2020 / Ranger21

Ranger deep learning optimizer rewrite to use newest components
Apache License 2.0

Performance of ResNet50 on ImageNet #27

Open juntang-zhuang opened 2 years ago

juntang-zhuang commented 2 years ago

Hi, thanks for the nice project. I noticed your paper reports 73.69 accuracy on ImageNet with ResNet50, which is much lower than the numbers reported by Keras https://keras.io/api/applications/ (74.9, and 76.0 for v2) and PyTorch (76.15). Does this mean Ranger cannot reach the accuracy officially reported with SGD, or is the gap caused by other settings being different? If the latter, how does Ranger compare to the best SGD result in a fair setting?

lessw2020 commented 2 years ago

Hi @juntang-zhuang, in our paper we only trained for 60 epochs due to the cost of training; people typically train 200-300 epochs on ImageNet when the goal is to maximize final model accuracy. Our purpose was to compare the AdamW and Ranger21 optimizers head to head with all other variables the same - i.e. a straightforward set of transformations and a fixed compute budget (< $1K on training). Thus, the only takeaway from our paper is that Ranger21 outperforms AdamW when all other variables are identical (same transformations, same total training epochs). It says nothing about the Keras, PyTorch, or SGD results, as we would need to replicate the same total number of epochs and the same augmentation pipeline used in those setups while training with Ranger21 to make a direct apples-to-apples comparison. Hope that helps! Less
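
For anyone wanting to reproduce that head-to-head setup, a rough sketch of the idea is below: the model, transforms, and epoch budget stay fixed and only the optimizer is swapped. This assumes the Ranger21 constructor accepts `lr`, `num_epochs`, and `num_batches_per_epoch` (it needs the run length up front for its warmup/warmdown schedule); the hyperparameter values are placeholders, not the ones used in the paper.

```python
import torch
import torchvision
import ranger21  # this repo's package

def build_optimizer(name, model, num_batches_per_epoch, num_epochs=60, lr=1e-3):
    """Swap only the optimizer; every other training setting stays identical."""
    if name == "adamw":
        return torch.optim.AdamW(model.parameters(), lr=lr)
    if name == "ranger21":
        # Assumed constructor arguments: Ranger21 schedules warmup/warmdown
        # from the total number of steps, so it needs these counts up front.
        return ranger21.Ranger21(
            model.parameters(),
            lr=lr,
            num_epochs=num_epochs,
            num_batches_per_epoch=num_batches_per_epoch,
        )
    raise ValueError(f"unknown optimizer: {name}")

# Same ResNet50 (and, elsewhere, the same transforms/dataloader) for both runs.
model = torchvision.models.resnet50(weights=None)
# num_batches_per_epoch would normally be len(train_loader); 1000 is a placeholder.
optimizer = build_optimizer("ranger21", model, num_batches_per_epoch=1000)
```

Running the same script twice, once with `"adamw"` and once with `"ranger21"`, is the fixed-budget comparison described above; matching the Keras/PyTorch reference numbers would additionally require their longer schedules and augmentation pipelines.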