coiled / feedback

A place to provide Coiled feedback

ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied #79

Closed leifulstrup closed 4 years ago

leifulstrup commented 4 years ago

When trying to read S3 objects through a wildcard path, using:

s3_location_wildcard = 's3://bucket-name/myfilename*.csv'

df_spending = dd.read_csv(s3_location_wildcard, dtype = dtype, storage_options={"anon": True}, blocksize="16 MiB").persist()

I get this error inside s3fs/core.py:


ClientError                               Traceback (most recent call last)
~/opt/anaconda3/envs/coiled_env/lib/python3.8/site-packages/s3fs/core.py in _lsdir(self, path, refresh, max_items, delimiter)
    420     dircache = []
--> 421     async for i in it:
    422         dircache.extend(i.get('CommonPrefixes', []))

~/opt/anaconda3/envs/coiled_env/lib/python3.8/site-packages/aiobotocore/paginate.py in __anext__(self)
     30     while True:
---> 31         response = await self._make_request(current_kwargs)
     32         parsed = self._extract_parsed_response(response)

~/opt/anaconda3/envs/coiled_env/lib/python3.8/site-packages/aiobotocore/client.py in _make_api_call(self, operation_name, api_params)
    150     error_class = self.exceptions.from_code(error_code)
--> 151     raise error_class(parsed_response, operation_name)
    152 else:

ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

The above exception was the direct cause of the following exception:

PermissionError Traceback (most recent call last)

in
      4
      5 #df_spending = dd.read_csv(s3_location, dtype = dtype, storage_options={"anon": True}, blocksize="16 MiB").persist() #blocksize="16 MiB",
----> 6 df_spending = dd.read_csv(s3_location_wildcard, dtype = dtype, storage_options={"anon": True}, blocksize="16 MiB").persist() #blocksize="16 MiB",
      7
      8 df_spending.head()

The individual CSV files can be read into dd.read_csv but the wildcard version throws the above error.

import s3fs
s3fs.__version__
'0.5.1'

import dask
dask.__version__
'2.28.0'

necaris commented 4 years ago

@leifulstrup are you able to provide more details about the S3 bucket you are trying to access?

leifulstrup commented 4 years ago

@necaris yes but I don't want to paste details here. I set each file for public access. I am able to read each file individually and access via pandas and dask but then when I introduce the * in place of the char that changes between files it gives me that error. I suspect that it is "operator error" by me and a setting. Does the S3 bucket with the files need to have a special type of permission so that the list of the directory items that matches can be queried? It may be a security precaution by AWS to make it harder to query an S3 directory. Is there a bucket-level setting that I need to change?

mrocklin commented 4 years ago

My guess is that your S3 bucket doesn't give access to list objects, hence the ListObjectsV2 error. Making the bucket listable should resolve this issue. I encourage operating on a per-bucket level rather than a per-object/per-file level.
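For context on why single files work while the wildcard fails: expanding a glob means listing the bucket (the ListObjectsV2 call in the traceback), whereas reading a key that is already known does not. If changing the bucket permissions is not an option, one possible workaround is to build the list of keys by hand and pass that list to dd.read_csv, which accepts a list of paths as well as a single string. This is only a sketch: the helper names `expand_paths` and `read_csv_without_listing`, the bucket name, and the suffix values are hypothetical.

```python
from typing import Iterable, List


def expand_paths(template: str, parts: Iterable[str]) -> List[str]:
    """Replace the single '*' in `template` with each value in `parts`."""
    if template.count("*") != 1:
        raise ValueError("template must contain exactly one '*'")
    return [template.replace("*", str(part)) for part in parts]


def read_csv_without_listing(paths: List[str], **kwargs):
    """Read explicit S3 keys with dask; there is no glob, so no listing is requested."""
    # Imported lazily so expand_paths can be used without dask installed.
    import dask.dataframe as dd

    return dd.read_csv(paths, storage_options={"anon": True}, **kwargs)


# Hypothetical bucket, file name, and suffixes; substitute the real ones.
paths = expand_paths("s3://bucket-name/myfilename*.csv", ["1", "2", "3"])
print(paths)
```

Calling `read_csv_without_listing(paths, dtype=dtype, blocksize="16 MiB")` should then behave like the per-file reads that already work, at the cost of maintaining the key list yourself.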

If you are concerned about making things too public, please note that Coiled will use your local credentials to generate a temporary security token and pass that token to your Dask workers. You should be able to make things readable and listable by a small set of people (just you if you want) and still process your data.


leifulstrup commented 4 years ago

@mrocklin thank you. I will try that.

leifulstrup commented 4 years ago

@mrocklin solved. I needed to explicitly add List access in S3. Thanks.
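For readers hitting the same error, here is a sketch of how "List access" might be expressed as a bucket policy. The thread does not say how the permission was actually granted, so treat this as one possibility. In IAM terms, ListObjectsV2 is governed by the `s3:ListBucket` action, which applies to the bucket ARN, while `s3:GetObject` applies to the object ARNs under it. The function names, bucket name, and principal ARN below are placeholders.

```python
import json


def build_read_list_policy(bucket: str, principal_arn: str) -> dict:
    """Build a policy letting one principal list the bucket and read its objects."""
    principal = {"AWS": principal_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowList",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",  # needed for ListObjectsV2 (glob expansion)
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "AllowRead",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",  # needed to read each matched file
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def apply_policy(bucket: str, policy: dict) -> None:
    """Attach the policy to the bucket; requires boto3 and suitable credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))


# Placeholder bucket and principal; substitute real values before applying.
policy = build_read_list_policy(
    "bucket-name", "arn:aws:iam::123456789012:user/example-user"
)
print(json.dumps(policy, indent=2))
```

With access restricted to a named principal like this, the read call needs to send credentials, so `storage_options={"anon": True}` would have to be dropped from dd.read_csv.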

jrbourbeau commented 4 years ago

Thanks for following up @leifulstrup! I'll close this issue for now as it seems to be resolved. Feel free to re-open if needed.