mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

[DCNV2] Add MLPerf logging #616

Closed: janekl closed this 1 year ago

janekl commented 1 year ago

Author: Jan Lasek (jlasek_at_nvidia.com)

Reopening https://github.com/mlcommons/training/pull/615 after signing MLCommons CLA.

github-actions[bot] commented 1 year ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

janekl commented 1 year ago

It looks like I'm still not a member despite signing the CLA last week. Could you please add me, @johntran-nv or @erichan1? Thanks.

morphine00 commented 1 year ago

recheck

janekl commented 1 year ago

I also added logging of the seed, with a remark that it does not currently work for model weights. I'll file an issue for that to make it clear to everyone. https://github.com/mlcommons/training/blob/bc163f0540575adf4a73c056d870f16daef89a3f/recommendation_v2/torchrec_dlrm/dlrm_main.py#L713-L716
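
For illustration only, here is a minimal sketch of what recording a seed in an MLPerf-style log line could look like. It is not the code from this PR: `format_mllog_event` and `seed_and_log` are hypothetical helpers, the `:::MLLOG` JSON layout is an assumption modeled on the MLPerf logging format, and only Python's own random number generator is seeded.

```python
import json
import random
import time


def format_mllog_event(key, value, metadata=None):
    """Build one MLPerf-style log line (hypothetical helper, not the PR's code)."""
    record = {
        "namespace": "",
        "time_ms": int(time.time() * 1000),
        "event_type": "POINT_IN_TIME",
        "key": key,
        "value": value,
        "metadata": metadata or {},
    }
    # Assumed layout: a ":::MLLOG" prefix followed by a JSON payload.
    return ":::MLLOG " + json.dumps(record)


def seed_and_log(seed):
    """Seed Python's RNG and return the log line that records the seed."""
    random.seed(seed)
    # Assumption: model weights may be initialized from a different RNG
    # that this call does not touch, which is the caveat mentioned above.
    return format_mllog_event("seed", seed)


line = seed_and_log(1234)
print(line)
```

Logging the seed only documents which value was requested; whether every source of randomness in a run honors it has to be checked separately.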