mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0
1.6k stars, 553 forks

[RetinaNet] Fixed +1 epoch in validation #552

Closed: ahmadki closed this 2 years ago

github-actions[bot] commented 2 years ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

johntran-nv commented 2 years ago

@erichan1 who do you think should approve this one? Or do we need to discuss live in the meeting?

erichan1 commented 2 years ago

I think it's ok to merge now if it's a technical fix. I've included it in the agenda for the next WG meeting to make sure people are aware.

@ahmadki Could you explain this PR? Is this just to change 0-indexed epochs to 1-indexed epochs? I.e., you train on the 0th epoch, and then when you save the checkpoint you want to call it retinanet_model_1_epoch.ckpt?
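
To make the question concrete, here is a minimal sketch of the naming difference being asked about. It assumes the training loop counts epochs from 0 and that the checkpoint name should report the number of completed epochs. The helper `checkpoint_name` and its `one_indexed` flag are hypothetical and are not taken from the RetinaNet reference code; only the file name pattern comes from the example above.

```python
def checkpoint_name(epoch_index: int, one_indexed: bool = True) -> str:
    """Build a checkpoint file name for a 0-indexed training epoch.

    Hypothetical helper for illustration only. With one_indexed=True the
    name reports how many epochs have completed (epoch_index + 1); with
    one_indexed=False it reports the raw loop index.
    """
    label = epoch_index + 1 if one_indexed else epoch_index
    return f"retinanet_model_{label}_epoch.ckpt"


# The loop index starts at 0, so the first saved checkpoint differs by name only.
for epoch_index in range(3):
    print(epoch_index, checkpoint_name(epoch_index), checkpoint_name(epoch_index, one_indexed=False))
```

Under these assumptions the trained weights are identical either way; only the label attached to the checkpoint, and any validation log line that reuses it, shifts by one.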