mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

Clarify that RCP interpolation can be between any two RCPs (one higher, one lower) #451

Open petermattson opened 3 years ago

nv-rborkar commented 2 years ago

We also have some RCPs that break the non-decreasing requirement with respect to increasing batch size. This can happen if better hyperparameters were not known when these RCPs were generated.

The WG can discuss two approaches to address this: (A) delete the RCP points that break the non-decreasing trendline, or (B) have the RCP checker interpolate between neighboring RCPs, or any two RCPs, if a direct check at the RCP batch size fails.
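To make option (B) concrete, here is a minimal sketch of interpolating between one lower and one higher RCP. It is not the actual RCP-checker code. The function name `interpolate_rcp_epochs`, the dict layout (batch size mapped to mean epochs to converge), the choice of linear interpolation between the nearest neighbors, and the sample numbers are all assumptions made for illustration.

```python
from typing import Dict


def interpolate_rcp_epochs(rcps: Dict[int, float], batch_size: int) -> float:
    """Estimate mean epochs-to-converge at batch_size from bracketing RCPs.

    rcps maps batch size -> mean epochs to converge (hypothetical layout).
    Uses plain linear interpolation between the nearest lower and nearest
    higher RCP; the real checker may use a different scheme.
    """
    if batch_size in rcps:
        # An RCP exists at exactly this batch size; no interpolation needed.
        return rcps[batch_size]

    lower = [b for b in rcps if b < batch_size]
    higher = [b for b in rcps if b > batch_size]
    if not lower or not higher:
        raise ValueError("batch_size must lie between two existing RCPs")

    lo, hi = max(lower), min(higher)
    frac = (batch_size - lo) / (hi - lo)
    return rcps[lo] + frac * (rcps[hi] - rcps[lo])


# Made-up reference points, for illustration only.
example_rcps = {256: 10.0, 1024: 16.0}
print(interpolate_rcp_epochs(example_rcps, 640))  # halfway between the two RCPs
```

Passing an explicit pair of batch sizes instead of always taking the nearest neighbors would cover the "any two RCPs" variant.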

cc: @emizan76

emizan76 commented 2 years ago

I vote for (A). Last time we did not do that, and there was confusion because submission-time RCPs practically overwrote the original ones. We should have convincing curves for all RCPs with respect to batch size.
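For option (A), a first step could be to list which RCP points break the trend before deciding what to delete. The sketch below is an illustration only, not part of the RCP checker. It assumes the requirement is that mean epochs to converge must not decrease as batch size grows; `find_trend_violations`, the dict layout, and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def find_trend_violations(rcps: Dict[int, float]) -> List[Tuple[int, int]]:
    """Return adjacent (smaller_bs, larger_bs) pairs where epochs decrease.

    rcps maps batch size -> mean epochs to converge (hypothetical layout).
    Assumes epochs-to-converge should be non-decreasing in batch size.
    """
    batch_sizes = sorted(rcps)
    violations = []
    for smaller, larger in zip(batch_sizes, batch_sizes[1:]):
        if rcps[larger] < rcps[smaller]:
            # The larger batch converges in fewer epochs than the smaller one,
            # which breaks the assumed non-decreasing trend.
            violations.append((smaller, larger))
    return violations


# Made-up reference points, for illustration only.
example_rcps = {256: 10.0, 512: 12.0, 1024: 11.0, 2048: 15.0}
print(find_trend_violations(example_rcps))
```

The function only flags the offending pairs. Which point of each pair to delete, or regenerate with better hyperparameters, is left to the WG.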