mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

Unable to download tar file in the mlcommons-training-wg-s3 S3 Bucket #693

Closed ajscalers closed 3 months ago

ajscalers commented 11 months ago

Hi,

I was trying to download the S3 artifacts for Megatron, as specified in the training/large_language_model/megatron-lm/README.md file. I tried the following code:

import boto3
from botocore.exceptions import NoCredentialsError

def download_file_from_s3(bucket_name, object_key, destination_path):
    # Initialize a session using AWS S3
    s3 = boto3.client('s3', aws_access_key_id='3ZC41B4Z2WHM5DT2', aws_secret_access_key='AK4NQQZV0NKFEJWJUZVPX5XQ0QNTXCGW')
    s3.download_file(bucket_name, object_key, destination_path)

# Replace these values with your own
bucket_name = 'https://s3.us-east-1.lyvecloud.seagate.com/mlcommons-training-wg-s3'
object_key = 'gpt3/megatron-lm/checkpoint_megatron_fp32.tar'
destination_path = '.'

# Call the function to download the file
download_file_from_s3(bucket_name, object_key, destination_path)

However, I get the following error: botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden.

I also tried listing the files in the bucket using:

import boto3
from botocore.exceptions import NoCredentialsError

def download_file_from_s3(bucket_name, object_key, destination_path):
    # Initialize a session using AWS S3
    s3 = boto3.client('s3', aws_access_key_id='3ZC41B4Z2WHM5DT2', aws_secret_access_key='AK4NQQZV0NKFEJWJUZVPX5XQ0QNTXCGW')
    response = s3.list_objects_v2(
        Bucket=bucket_name,
        # Prefix="gpt3/megatron-lm/"
    )
    print(response)

# Replace these values with your own
bucket_name = 'mlcommons-training-wg-s3'
object_key = 'https://s3.us-east-1.lyvecloud.seagate.com'
destination_path = '.'

# Call the function to download the file
download_file_from_s3(bucket_name, object_key, destination_path)

This results in the following error: botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListObjectsV2 operation: The AWS Access Key Id you provided does not exist in our records.

ShriyaPalsamudram commented 3 months ago

@ajscalers is this still an issue? It has been more than 6 months.

If so, @nathanw-mlc, is this something you could help with?

ajscalers commented 3 months ago

Hi @ShriyaPalsamudram. Sorry, forgot about this issue.

We were able to download the model by following the instructions in the MLCommons inference repo.
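
For anyone who hits the same two errors with boto3, here is a minimal sketch of the likely cause. It is not a verified fix. Neither snippet above passes endpoint_url, so boto3 sends the request to AWS S3 instead of the Lyve Cloud endpoint, which would explain why AWS reports that the access key does not exist. The first snippet also puts the full URL in bucket_name, and download_file expects a file path rather than '.' as the destination. The helper split_bucket_url and the choice to read credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are assumptions made for illustration. The endpoint may also need settings (such as a region) that are not covered here.

```python
import os
from urllib.parse import urlparse


def split_bucket_url(url):
    """Split 'https://host/bucket' into (endpoint_url, bucket_name).

    Hypothetical helper, not part of boto3.
    """
    parts = urlparse(url)
    endpoint_url = f"{parts.scheme}://{parts.netloc}"
    bucket_name = parts.path.strip("/").split("/")[0]
    return endpoint_url, bucket_name


def download_file_from_s3(bucket_url, object_key, destination_path,
                          access_key, secret_key):
    # Imported here so the URL helper above can be used without boto3 installed.
    import boto3

    endpoint_url, bucket_name = split_bucket_url(bucket_url)
    # endpoint_url points the client at the S3-compatible service
    # instead of the default AWS S3 endpoint.
    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    # destination_path must be a file name, not a directory such as '.'.
    s3.download_file(bucket_name, object_key, destination_path)


if __name__ == "__main__":
    download_file_from_s3(
        "https://s3.us-east-1.lyvecloud.seagate.com/mlcommons-training-wg-s3",
        "gpt3/megatron-lm/checkpoint_megatron_fp32.tar",
        "checkpoint_megatron_fp32.tar",
        os.environ["AWS_ACCESS_KEY_ID"],      # assumed to hold the key from the README
        os.environ["AWS_SECRET_ACCESS_KEY"],  # assumed to hold the secret from the README
    )
```

Keeping the endpoint and the bucket name separate matters because boto3's Bucket argument takes only the bucket name, so passing the whole URL there cannot work.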