mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

Bar For HP Table Modification #284

Open bitfort opened 4 years ago

bitfort commented 4 years ago

SWG Discussion on HPs:

To sort through the HP challenges for v0.7, we are looking at roughly these tasks:

  1. Clearly list all HPs from v0.6 and their rules.
  2. Choose an alignment strategy (for consistency and sanity).
  3. Refactor the v0.6 HPs based on the alignment strategy.
  4. Enumerate the proposed changes (add rows, remove rows, change rows).
  5. Define a bar for accepting changes.
  6. Apply the bar to the proposed changes.

What is our fundamental goal? To create a level playing field where everyone can demonstrate the best performance possible, weighed against the expense of HP searches.

Should we accept this change to the hyperparameter table (for an existing optimizer)?

  1. The proposal would take the form of removing constraints or adding a new tunable hyperparameter (i.e., removing the constraint that the HP must be equal to a constant reference value).
  2. Show that it changes the expected number of epochs: run under the old rules and under the new rules N times each, and show that the olympic-scored epoch count improves by at least 5% (a sketch of this check follows the list).
  3. Show that it enables a batch size more than P% larger than the current largest submitted batch size, or more than P% smaller than the current smallest.
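
A minimal sketch of how criterion 2 might be checked, assuming "olympic scoring" means dropping the single best and worst runs and averaging the rest. The function names, example epoch counts, and threshold handling here are illustrative only, not official MLPerf tooling:

```python
def olympic_score(epoch_counts):
    """Drop the min and max, then average the remaining runs."""
    if len(epoch_counts) < 3:
        raise ValueError("Need at least 3 runs to olympic-score")
    trimmed = sorted(epoch_counts)[1:-1]
    return sum(trimmed) / len(trimmed)

def passes_epoch_bar(old_rule_epochs, new_rule_epochs, threshold=0.05):
    """True if the olympic-scored epoch count improves by at least `threshold`."""
    old_score = olympic_score(old_rule_epochs)
    new_score = olympic_score(new_rule_epochs)
    return (old_score - new_score) / old_score >= threshold

# Example: 5 runs per rule set; the new rules converge in fewer epochs.
print(passes_epoch_bar([40, 42, 41, 44, 43], [37, 38, 39, 40, 38]))  # True (~8.7% improvement)
```
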
bitfort commented 4 years ago

SWG Discussion:

To show that something "didn't converge", run 5% more steps/epochs than the comparison point and show that the quality metric still does not reach the target.
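
A minimal sketch of that non-convergence demonstration, assuming the comparison point is the epoch budget of the converging configuration; the function name, the 5% margin parameter, and the example values are illustrative only:

```python
def shows_non_convergence(quality_per_epoch, target_quality, comparison_epochs,
                          margin=0.05):
    """True if the run covered at least (1 + margin) x the comparison budget
    and the quality metric never reached the target."""
    required_epochs = int(round(comparison_epochs * (1 + margin)))
    ran_long_enough = len(quality_per_epoch) >= required_epochs
    never_converged = max(quality_per_epoch) < target_quality
    return ran_long_enough and never_converged

# Example: comparison point is 40 epochs, so at least 42 epochs are required.
qualities = [0.70 + 0.001 * i for i in range(42)]  # tops out around 0.741
print(shows_non_convergence(qualities, target_quality=0.759, comparison_epochs=40))  # True
```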

petermattson commented 4 years ago

Re-examine this for v0.8 and incorporate it into the rules.