mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

[HPC] add note about pruned logs validity #499

Closed. sparticlesteve closed 1 year ago

sparticlesteve commented 2 years ago

Added a bullet to the throughput measurement section of the HPC training rules clarifying that pruned logs should be rules-compliant.

Fixes https://github.com/mlcommons/training_policies/issues/493

github-actions[bot] commented 2 years ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

azrael417 commented 2 years ago

We need to augment this PR with an issue against mlperf logging. The compliance checker, the RCP checker, and the result summarizer need to be made aware of this. For example, we need to define how a submitter specifies pruned logs, and adjust these tools accordingly.

sparticlesteve commented 2 years ago

Copying my Discord comment here.

The pruned logs go into a pruned subfolder, as before. That's explained in the submission rules: https://github.com/mlcommons/policies/blob/master/submission_rules.adoc#562-hpc

We may not even need to run the checker on them. It's only really needed when someone does hyperparameter (HP) borrowing for the throughput measurement and has pruned logs. The easiest option would be to leave the logging package unchanged in this respect. If needed, we can inspect pruned logs manually or run the checker on them manually.
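For the manual route, a minimal sketch of how a reviewer might collect the pruned logs to inspect could look like the following. It assumes only what is stated above, namely that pruned logs sit in a subfolder named `pruned`; the function name `find_pruned_logs`, the `results_root` argument, and the idea of walking the whole results tree are hypothetical and not part of the logging package.

```python
from pathlib import Path
from typing import List


def find_pruned_logs(results_root: str, folder_name: str = "pruned") -> List[Path]:
    """Return every file that lives under a directory named `folder_name`.

    Hypothetical helper: the folder name "pruned" comes from the discussion
    above; everything else here is an assumption made for illustration.
    """
    root = Path(results_root)
    found = []
    for path in root.rglob("*"):
        # Only look at parent directory names relative to the root, so a
        # root path that happens to contain "pruned" does not match.
        parents = path.relative_to(root).parts[:-1]
        if path.is_file() and folder_name in parents:
            found.append(path)
    return sorted(found)
```

The returned paths can then be read by hand or passed one at a time to whichever checker the reviewer wants to run.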

azrael417 commented 2 years ago

ok, that makes sense, thank you