mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

AccessDeniedException: 403 does not have storage.objects.list access to the Google Cloud Storage bucket. #669

Open zwang92 opened 11 months ago

zwang92 commented 11 months ago

I am trying to follow the data download instructions at https://github.com/mlcommons/training/blob/master/large_language_model/megatron-lm/README.md#data-download to download data from gs://mlperf-llm-public2, using the following command: gsutil cp -r gs://mlperf-llm-public2/c4/en_val_subset_json/c4-validation_24567exp.json .

It fails with the following error message: "AccessDeniedException: 403 xxx.xxx@gmail.com does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist)."

Could anyone suggest how to download gs://mlperf-llm-public2/c4/en_val_subset_json/c4-validation_24567exp.json?

Thanks a lot
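
One thing that may be worth trying, as a sketch rather than a confirmed fix: the error names the storage.objects.list permission, whereas fetching a single object by its exact name over HTTPS needs only read access to that object. Cloud Storage serves objects at https://storage.googleapis.com/BUCKET/OBJECT, so the gs:// path can be mapped to that form. The helper name `gs_to_https` below is hypothetical, and whether this bucket allows reads for a given account (or anonymously) is an assumption that this thread does not confirm.

```python
from urllib.parse import quote


def gs_to_https(gs_uri: str) -> str:
    """Map gs://bucket/object to the storage.googleapis.com URL for that object.

    Hypothetical helper for illustration; it only rewrites the string and
    does not check that the object exists or is readable.
    """
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    # Split "bucket/path/to/object" into bucket and object name.
    bucket, _, obj = gs_uri[len(prefix):].partition("/")
    if not bucket or not obj:
        raise ValueError(f"URI must name both a bucket and an object: {gs_uri!r}")
    # Percent-encode the object name but keep "/" as the path separator.
    return f"https://storage.googleapis.com/{bucket}/{quote(obj, safe='/')}"


if __name__ == "__main__":
    uri = "gs://mlperf-llm-public2/c4/en_val_subset_json/c4-validation_24567exp.json"
    print(gs_to_https(uri))
```

If the object is readable, the printed URL can be fetched with any HTTP client. If it still returns 403, the bucket is most likely restricted and access would have to be granted by the bucket owners.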