blaze / odo

Data Migration for the Blaze Project
http://odo.readthedocs.org/
BSD 3-Clause "New" or "Revised" License

ODO cannot access S3 with S3ResponseError: 403 Forbidden #559

Open spandanbrahmbhatt opened 7 years ago

spandanbrahmbhatt commented 7 years ago

I am using the odo library to transfer a pandas DataFrame to S3, but I am getting the following error:

    import pandas as pd
    from odo import odo

    df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], columns=["A", "B"])
    odo(df, 's3://path_to_s3_folder')
    S3ResponseError: S3ResponseError: 403 Forbidden
    <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>AccessDenied</Code><Message>Anonymous access is forbidden for this operation</Message><RequestId>F5958774D56AD29E</RequestId><HostId>zOH8JOxpSgB5Scgc/YrtHO1+e9lXoKAF89IhRSeAiSoGHAJxyjXKBVFIYETeO4gSLZOUgXmwKLM=</HostId></Error>

Now, I have my AWS credentials set up correctly, as I can see in my ~/.aws/credentials file:

    cat credentials
    [default]
    aws_access_key_id = XXXXX
    aws_secret_access_key = XXXXXXXXXX
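For what it's worth, boto only uses the `[default]` profile unless told otherwise, so it can be worth double-checking that the file actually parses and that the section name is exactly `default`. A minimal stdlib sanity check (the contents below are just the placeholder values from above; for a real check, read `~/.aws/credentials` instead):

    from configparser import ConfigParser

    # Placeholder contents mirroring the ~/.aws/credentials file shown above;
    # for a real check use parser.read(os.path.expanduser("~/.aws/credentials")).
    sample = """\
    [default]
    aws_access_key_id = XXXXX
    aws_secret_access_key = XXXXXXXXXX
    """

    parser = ConfigParser()
    parser.read_string(sample)

    assert parser.has_section("default")
    assert parser.has_option("default", "aws_access_key_id")
    assert parser.has_option("default", "aws_secret_access_key")
    print("credentials file parses with a [default] profile")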

The AWS CLI also works for me: I can run `aws s3 ls` and `aws s3 cp` commands without errors, so I assume I have the required permissions.

    aws s3 ls s3://path_to_s3

boto3 is also able to access S3 resources and does not raise an error:

    import boto3
    s3 = boto3.resource('s3')
    for bucket in s3.buckets.all():
        print(bucket.name)

What could be wrong or missing in my configuration?
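One cheap thing to rule out: odo's S3 backend goes through boto (not boto3), and older boto releases did not always pick up `~/.aws/credentials`, while they do read the standard environment variables. Exporting the keys before running odo is a hedged suggestion, not a confirmed fix:

    # Assumption: odo's boto-based S3 backend falls back to these standard
    # environment variables when no credentials are passed explicitly.
    export AWS_ACCESS_KEY_ID="XXXXX"          # same placeholder values as above
    export AWS_SECRET_ACCESS_KEY="XXXXXXXXXX"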

paulochf commented 6 years ago

I went further:

    from os import environ

    from odo import resource

    env = environ

    resource(
        "s3://path/to/file/*.csv.gz",
        aws_access_key_id=env["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
    )

The data is there and I can access it over HTTP, but I get the same error message as the OP.

---------------------------------------------------------------------------
S3ResponseError                           Traceback (most recent call last)
<ipython-input-9-1384a4ddea98> in <module>()
      1 resource("s3://path/to/file/*.csv.gz",
      2              aws_access_key_id=aws_envs["AWS_ACCESS_KEY_ID"],
----> 3              aws_secret_access_key=aws_envs["AWS_SECRET_ACCESS_KEY"])

my_project/lib/python3.6/site-packages/odo/regex.py in __call__(self, s, *args, **kwargs)
     89 
     90     def __call__(self, s, *args, **kwargs):
---> 91         return self.dispatch(s)(s, *args, **kwargs)
     92 
     93     @property

my_project/lib/python3.6/site-packages/odo/backends/aws.py in resource_s3_csv_glob(uri, **kwargs)
    157     con = get_s3_connection()
    158     result = urlparse(uri)
--> 159     bucket = con.get_bucket(result.netloc)
    160     key = result.path.lstrip('/')
    161 

my_project/lib/python3.6/site-packages/boto/s3/connection.py in get_bucket(self, bucket_name, validate, headers)
    507         """
    508         if validate:
--> 509             return self.head_bucket(bucket_name, headers=headers)
    510         else:
    511             return self.bucket_class(self, bucket_name)

my_project/lib/python3.6/site-packages/boto/s3/connection.py in head_bucket(self, bucket_name, headers)
    540             err.error_code = 'AccessDenied'
    541             err.error_message = 'Access Denied'
--> 542             raise err
    543         elif response.status == 404:
    544             # For backward-compatibility, we'll populate part of the exception

S3ResponseError: S3ResponseError: 403 Forbidden
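The traceback gives two hints about where this goes wrong. First, as far as the visible frames show, `resource_s3_csv_glob` opens the connection with a bare `get_s3_connection()`, so the keys passed to `resource(...)` may never reach boto and the connection can end up anonymous (which matches the "Anonymous access is forbidden" message in the OP's error). Second, boto's `get_bucket` validates via `head_bucket`, which needs permission on the bucket itself, not just on individual keys. A small stdlib sketch of how odo splits the bucket and key out of the URI (using the placeholder URI from the comment above):

    from urllib.parse import urlparse

    # Same split odo performs in resource_s3_csv_glob (lines 158-160 of the
    # traceback): bucket = netloc, key = path without the leading slash.
    uri = "s3://path/to/file/*.csv.gz"  # placeholder URI from above
    result = urlparse(uri)

    bucket = result.netloc          # "path" -- the bucket name only
    key = result.path.lstrip("/")   # "to/file/*.csv.gz" -- the key glob

    print(bucket)
    print(key)

So the 403 is raised while validating the bucket named by the URI's netloc, before any key is touched.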