mlcommons / algorithmic-efficiency

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
https://mlcommons.org/en/groups/research-algorithms/
Apache License 2.0

[fix] random_utils.py to `_signed_to_unsigned` #739

Closed by tfaod 4 months ago

tfaod commented 4 months ago

When running `submission_runner` on the self-tuning track, we hit the error below in the call to `_signed_to_unsigned` in `random_utils.py`.

I've added a fix.

    rng = prng.PRNGKey(rng_seed)
  File "/private/home/axyang/optimization/algorithmic-efficiency-entry/algorithmic_efficiency/random_utils.py", line 79, in PRNGKey
    return _PRNGKey(seed)
tfaod commented 4 months ago

The error originates from the following line in the self-tuning infrastructure: https://github.com/mlcommons/algorithmic-efficiency/blob/576d5e37c318f1d04398206ed2a781dd1dac0a56/submission_runner.py#L610

It crashes `score_submission_on_workload` before training can start.
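For context, here is a minimal sketch of what a signed-to-unsigned seed conversion generally looks like. The traceback above is cut off before the final exception, so this does not claim to reproduce the actual failure or the fix in this PR. The function name `signed_to_unsigned_sketch` and the accepted input types are assumptions made for illustration only, not the code in `random_utils.py`.

```python
from typing import List, Sequence, Union

# Hypothetical alias for this sketch; the real module may define seeds differently.
SeedLike = Union[int, Sequence[int]]


def signed_to_unsigned_sketch(seed: SeedLike) -> Union[int, List[int]]:
  """Map signed integer seed(s) into the unsigned 32-bit range [0, 2**32)."""
  if isinstance(seed, int):
    # Python's % always returns a non-negative result for a positive modulus,
    # so negative seeds wrap around to their unsigned 32-bit equivalent.
    return seed % 2**32
  if isinstance(seed, (list, tuple)):
    # Convert element-wise; a tuple input is returned as a list.
    return [signed_to_unsigned_sketch(s) for s in seed]
  # Fail loudly on anything else instead of silently returning None.
  raise TypeError(f'Unsupported seed type: {type(seed).__name__}')
```

A helper like this raises an explicit `TypeError` for unexpected inputs, which may make type mismatches easier to diagnose when a caller passes a seed in a different form than expected.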