mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

[DLRMv2] Align optimizer parameters for embeddings and dense layers #622

Closed janekl closed 1 year ago

janekl commented 1 year ago

Author: Jan Lasek, Nvidia (jlasek_at_nvidia.com)

There is a parameter mismatch in the Adagrad optimizer between the embeddings and the dense layers in the new DLRMv2 recommender benchmark.

As the dense layers and the embeddings employ the PyTorch and FBGEMM Adagrad implementations, respectively, one needs to explicitly pass all relevant optimizer parameters in the apply_optimizer_in_backward call:
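For illustration, a minimal sketch of what passing the parameters explicitly could look like, assuming the torchrec apply_optimizer_in_backward helper; the modules and hyperparameter values below are placeholders, not the reference script's exact identifiers:

```python
import torch
from torchrec.optim.apply_optimizer_in_backward import apply_optimizer_in_backward

# Toy stand-ins for the sparse (embedding) and dense parts of the model.
embedding = torch.nn.EmbeddingBag(1000, 16)
dense = torch.nn.Linear(16, 1)

# Illustrative values; the point is that both optimizers receive the same ones.
lr, eps = 0.01, 1e-8

# Embedding parameters: the optimizer is fused into the backward pass, so every
# relevant hyperparameter must be passed here explicitly; anything omitted falls
# back to the fused implementation's own defaults.
apply_optimizer_in_backward(
    torch.optim.Adagrad,
    embedding.parameters(),
    {"lr": lr, "eps": eps},
)

# Dense parameters: plain PyTorch Adagrad with the same hyperparameters.
dense_optimizer = torch.optim.Adagrad(dense.parameters(), lr=lr, eps=eps)
```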

This mismatch between the two optimizers used is what I'm fixing here.

github-actions[bot] commented 1 year ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

erichan1 commented 1 year ago

@samiwilf if you give this an ok I will approve.

colin2328 commented 1 year ago

LGTM. @erichan1 if you can stamp please

erichan1 commented 1 year ago

@janekl feel free to land

janekl commented 1 year ago

> @janekl feel free to land

Hi @erichan1, thanks. It says "You're not authorized to merge this pull request", so I need to ask either you or @johntran-nv to merge it, please.