mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

adding checkpoint related rules to the training rules #274

Open christ1ne opened 4 years ago

christ1ne commented 4 years ago

I'd like to add a few points on the following, since we are using checkpoints in more than one benchmark.

bitfort commented 4 years ago

Checkpoints show up in benchmarks such as BERT and MiniGo. Working with checkpoints can be difficult, especially when using them across different frameworks.

Having a consistent policy around checkpoints would be hugely helpful to submitters. It would cover their documentation, how they are made available, and what formats they come in.

Next steps: a proposal for what this would look like, followed by an example using an existing checkpoint.
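
To make the discussion concrete, here is a rough sketch of what a per-checkpoint record covering the three points above (documentation, availability, format) might look like, plus an integrity check a submitter could run after downloading. Nothing here comes from the MLPerf rules: the `CheckpointRecord` class, its field names, the `tf_checkpoint` format label, and the example.org URL are all hypothetical placeholders.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointRecord:
    """Hypothetical metadata a checkpoint policy could require (not an MLPerf schema)."""
    benchmark: str    # benchmark the checkpoint belongs to
    framework: str    # framework that produced the checkpoint
    file_format: str  # on-disk format label (hypothetical naming)
    location: str     # where submitters can download it (placeholder URL)
    sha256: str       # hex digest of the published file
    notes: str = ""   # free-form documentation, e.g. how it was generated


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify(record: CheckpointRecord, data: bytes) -> bool:
    """Check that downloaded bytes match the digest published in the record."""
    return sha256_of(data) == record.sha256


# Stand-in bytes instead of a real checkpoint file.
payload = b"placeholder checkpoint bytes"
record = CheckpointRecord(
    benchmark="bert",
    framework="tensorflow",
    file_format="tf_checkpoint",
    location="https://example.org/checkpoints/bert.ckpt",
    sha256=sha256_of(payload),
    notes="Illustrative entry only.",
)
```

With a record like this, `verify(record, payload)` holds for the published bytes and fails for anything altered, which gives submitters a framework-independent way to confirm they start from the same checkpoint.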