mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

BERT dataset is missing. #403

Closed johntran-nv closed 3 years ago

johntran-nv commented 4 years ago

We need to find a place to host it. Last suggestion was MLCommons.

johntran-nv commented 4 years ago

Filing this in training_policies, but there is already an issue in training as well: https://github.com/mlperf/training/issues/377.

bitfort commented 4 years ago

SWG:

We will consult internally at MLCommons to find an appropriate place to host this, possibly MLCommons itself. David will report back.

bitfort commented 3 years ago

SWG:

This continues to be blocked on finding where to host it. We have made some progress with legal.

TheKanter commented 3 years ago

The Wikipedia dataset for BERT is available here: https://drive.google.com/drive/folders/1oQF4diVHNPCclykwdvQJw8n_VIWwV0PT?usp=sharing
Note that you must be a member of MLPerf Training to access it.

johntran-nv commented 3 years ago

One remaining question: do we intend to make this available to the public as well? @TheKanter

shoveller86 commented 3 years ago

How do I become a member of MLPerf Training?

bitfort commented 3 years ago

You can get started here: https://mlcommons.org/en/get-involved/#getting-started

Please let me know if you have any questions!

shoveller86 commented 3 years ago

Thank you, I'll try to join as a member.