mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

Add a rule about DLRM training data shuffling #441

Open johntran-nv opened 3 years ago

johntran-nv commented 3 years ago

The data shuffling rules for DLRM were not clear enough in the v0.7 round and left a lot of room for interpretation. This update adds a clear rule that is easy to follow and should not impact the convergence or performance of DLRM implementations.

This was originally part of https://github.com/mlcommons/training_policies/pull/411, which we discussed, but I mistakenly closed that PR thinking it was only about packing, which it is no longer used for. It is cleaner to break data shuffling out into its own PR anyway.
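The PR text itself is not quoted in this thread, so as a rough illustration only: a shuffling rule that is "easy to follow" and reproducible across implementations typically amounts to a deterministic, full-dataset permutation per epoch. The sketch below shows one way that can look; the function name `shuffled_batches` and the seed-plus-epoch scheme are assumptions for illustration, not the rule adopted in the PR.

```python
import numpy as np

def shuffled_batches(num_samples, batch_size, epoch, seed=0):
    """Yield batches of sample indices, reshuffled every epoch.

    Deriving the RNG seed from a fixed base seed plus the epoch
    (an assumption here, not the PR's wording) makes the shuffle
    deterministic and reproducible across implementations.
    """
    rng = np.random.default_rng(seed + epoch)
    order = rng.permutation(num_samples)  # full-dataset shuffle
    for start in range(0, num_samples, batch_size):
        yield order[start:start + batch_size]

# Example: iterate two epochs over a toy dataset of 10 samples.
for epoch in range(2):
    for batch in shuffled_batches(num_samples=10, batch_size=4, epoch=epoch):
        print(epoch, batch)
```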

github-actions[bot] commented 3 years ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

johntran-nv commented 3 years ago

+emizan@google.com, +deepak.r.canchi@intel.com, could you please review/approve?

johntran-nv commented 3 years ago

Deepak suggested that it is too late to change this for v1.0, which is fair. Let's defer the discussion to v1.1.

Separately, it looks like I inadvertently merged this, maybe as part of another PR. I'll go fix that now as well.