mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

BERT reference pretrained checkpoint is not publicly available #466

Closed matthew-frank closed 1 year ago

matthew-frank commented 3 years ago

The BERT reference requires starting from a specified pretrained checkpoint. https://github.com/mlcommons/training/tree/master/language_model/tensorflow/bert/README.md points to a Google Cloud Storage URL https://console.cloud.google.com/storage/browser/pkanwar-bert and claims that the reference checkpoint required is available from that URL.

But something is wrong with the contents of that URL: either the data isn't there, or the permissions are not currently set to allow access by anyone outside of Google. All that appears at that URL is the error message "Additional permissions required to list objects in this bucket: Ask a bucket owner to grant you 'storage.objects.list' permission."
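
For anyone who wants to reproduce this without the Cloud Console, here is a minimal sketch of an anonymous check against the Cloud Storage JSON API object-listing endpoint. The helper names (`list_url`, `interpret_status`, `check_bucket`) are made up for illustration, and the mapping from HTTP status to meaning is my assumption about how an unauthenticated request behaves, not something taken from the README.

```python
import urllib.error
import urllib.parse
import urllib.request


def list_url(bucket: str) -> str:
    """Build the JSON API URL that lists the objects in a bucket."""
    name = urllib.parse.quote(bucket, safe="")
    return f"https://storage.googleapis.com/storage/v1/b/{name}/o"


def interpret_status(status: int) -> str:
    """Map an HTTP status code to a short diagnosis (assumed meanings)."""
    if status == 200:
        return "publicly listable"
    if status in (401, 403):
        return "listing denied (storage.objects.list not granted)"
    if status == 404:
        return "bucket not found"
    return f"unexpected status {status}"


def check_bucket(bucket: str, timeout: float = 10.0) -> str:
    """Send one unauthenticated list request and describe the result."""
    try:
        with urllib.request.urlopen(list_url(bucket), timeout=timeout) as resp:
            return interpret_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return interpret_status(err.code)
```

Calling `check_bucket("pkanwar-bert")` sends a single request with no credentials, so it should show what a user outside of Google sees. A denied result only says that listing is blocked; individual objects could still be readable if their exact names are known.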

matthew-frank commented 3 years ago

This will be addressed by https://github.com/mlcommons/training/pull/463

matthew-frank commented 3 years ago

This is a duplicate of https://github.com/mlcommons/training/issues/444, which was closed without fixing the README.

matthew-frank commented 3 years ago

This is also a duplicate of https://github.com/mlcommons/training/issues/462.

TheKanter commented 3 years ago

https://drive.google.com/drive/u/4/folders/1oQF4diVHNPCclykwdvQJw8n_VIWwV0PT has the checkpoint files, but we need to update the documentation with instructions.

matthew-frank commented 3 years ago

David, yes. Please get someone to approve and merge PR 463 (https://github.com/mlcommons/training/pull/463).

johntran-nv commented 1 year ago

This was merged a long time ago. Closing.