mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

Llama2 - LoRA Reference Implementation #727

Closed — rgandikota closed this issue 4 months ago

rgandikota commented 7 months ago
  1. The README points to an eval.py that is missing from the llama2 scripts folder.
  2. Instructions for running this reference implementation on multiple nodes would be helpful for anyone looking to use it as-is.
  3. It would also help new submitters if the training time for the reference run were documented.
itayhubara commented 7 months ago
  1. I'll delete it; it's just confusing.
  2. @michal2409 can you open a PR to update the README with multi-node instructions?
  3. I can add one log, but we don't usually do that. @nv-rborkar what do you think?
rgandikota commented 6 months ago

@itayhubara Could you please let us know if we can use this dataset from Hugging Face instead of the parquet files from Google Drive? From the README instructions, it looks like the parquet files were built from this dataset: https://huggingface.co/datasets/tau/scrolls/blob/main/gov_report.zip

hiwotadese commented 4 months ago

@rgandikota can we close this issue?