fsspec / s3fs

S3 Filesystem
http://s3fs.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden #754

Open oanamoc opened 1 year ago

oanamoc commented 1 year ago

I have the following code that counts words, but I am getting an error on this line:

fhand = s3.open('lithops-data-yey/notsobigtextfile_small.txt')

I am not sure why it says Forbidden, and I don't know where to start in solving it.

The error is this:

Exception has occurred: PermissionError
Forbidden
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

The above exception was the direct cause of the following exception:

File "/Users/oanamoc/Desktop/Work/lithops/task1 copy.py", line 20, in <module>
    fhand = s3.open('lithops-data-yey/notsobigtextfile_small.txt')
PermissionError: Forbidden

My code is here:

from lithops.multiprocessing import Pool
import time
import s3fs

s3 = s3fs.S3FileSystem(anon=True)

def count_words(lines):
    count = 0
    for line in lines:
        line = line.rstrip()
        pieces = line.split()
        for initial_word in pieces:
            word = ''.join(c for c in initial_word if c.isalpha())
            if word:  # Skip empty words
                count = count + 1
    return count

if __name__ == '__main__':
    fhand = s3.open('lithops-data-yey/notsobigtextfile_small.txt')

    start = time.perf_counter()

    word_number = 0
    lines_per_chunk = 1000  # Number of lines to process per chunk

    with Pool() as p:
        chunks = []
        while True:
            chunk = []
            for _ in range(lines_per_chunk):
                line = fhand.readline()
                if not line:
                    break
                chunk.append(line)
            if not chunk:
                break
            chunks.append(chunk)

        line_numbers = p.map(count_words, chunks)

    word_number = sum(line_numbers)

    print('Word Count:', word_number)

    finish = time.perf_counter()

    print(f'Finished in {round(finish - start, 2)} second(s)')

martindurant commented 1 year ago

Could it be that indeed you don't have permission to view that file? Where did you get this URL from?

oanamoc commented 1 year ago

I have a bucket named lithops-data-yey and I put a file named notsobigtextfile_small.txt in it. I didn't take it from the internet or anything.

martindurant commented 1 year ago

> I put a file named notsobigtextfile_small.txt in it

How did you do this exactly? Did you make it public?

oanamoc commented 1 year ago

No, should I? I have a file with my credentials so theoretically it should be able to access it. At least that's what I thought.

martindurant commented 1 year ago

s3 = s3fs.S3FileSystem(anon=True)

says you don't want to use your credentials
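
To make the point above concrete, here is a minimal sketch of the two modes. The helper names `filesystem_kwargs` and `open_private_object` are hypothetical and exist only for this illustration; `anon` is the real `S3FileSystem` parameter. It assumes credentials are discoverable in the usual places (environment variables, `~/.aws/credentials`, or an attached role).

```python
def filesystem_kwargs(use_credentials: bool = True) -> dict:
    """Build keyword arguments for s3fs.S3FileSystem (hypothetical helper)."""
    # anon=True sends unsigned requests, so only public objects are readable
    # and a private bucket answers 403 Forbidden.
    # anon=False (the s3fs default) lets botocore look up credentials in the
    # environment, ~/.aws/credentials, or an attached IAM role.
    return {"anon": not use_credentials}


def open_private_object(path: str):
    """Open an object in a private bucket using the caller's credentials."""
    import s3fs  # imported lazily so the helper above works without s3fs

    fs = s3fs.S3FileSystem(**filesystem_kwargs(use_credentials=True))
    return fs.open(path, "rb")

# Example call (needs network access and valid credentials):
# fhand = open_private_object("lithops-data-yey/notsobigtextfile_small.txt")
```

Dropping `anon=True` entirely has the same effect, since `anon=False` is the default.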

oanamoc commented 1 year ago

You are right, I didn't see it.

I corrected it but now I have another issue.

Exception has occurred: FSTimeoutError
Read timeout on endpoint URL: "https://lithops-data-yey.s3.eu-west-3.amazonaws.com/notsobigtextfile_small.txt"
aiohttp.client_exceptions.ServerTimeoutError: Timeout on reading data from socket

During handling of the above exception, another exception occurred:

aiobotocore.response.AioReadTimeoutError: Read timeout on endpoint URL: "https://lithops-data-yey.s3.eu-west-3.amazonaws.com/notsobigtextfile_small.txt"

The above exception was the direct cause of the following exception:

File "/Users/oanamoc/Desktop/Work/lithops/task1 copy.py", line 35, in <module>
    line = fhand.readline()
fsspec.exceptions.FSTimeoutError:

martindurant commented 1 year ago

Is eu-west-3 indeed the region of your bucket? The exception looks like a network problem, or the url is wrong.

oanamoc commented 1 year ago

Yes it is, I checked again. I agree the URL given in the exception looks wrong, but the line fhand = s3.open('lithops-data-yey/notsobigtextfile_small.txt') should be right according to the examples I have seen.

martindurant commented 1 year ago

Do you have a default region set in your environment? Can you try to set one? Also, can you try with cache_regions=False or True when instantiating the filesystem?
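
A rough sketch of what trying these settings could look like. The helper names are hypothetical; `cache_regions`, `client_kwargs`, and `config_kwargs` are real `S3FileSystem` parameters, and `AWS_DEFAULT_REGION` is the environment variable botocore reads. The 60-second read timeout is an arbitrary value chosen for the experiment, and nothing here is confirmed to resolve the timeout.

```python
import os


def regional_filesystem_kwargs(region: str, read_timeout: int = 60) -> dict:
    """Keyword arguments pinning S3FileSystem to one region (hypothetical helper)."""
    return {
        "anon": False,  # use the caller's credentials
        "cache_regions": True,  # look up and remember each bucket's region
        "client_kwargs": {"region_name": region},  # passed to the S3 client
        "config_kwargs": {"read_timeout": read_timeout},  # botocore Config option
    }


def open_in_region(path: str, region: str = "eu-west-3"):
    """Open an object with an explicit region; needs s3fs and network access."""
    import s3fs  # imported lazily so the helper above works without s3fs

    # Set a default region only if the environment does not already have one.
    os.environ.setdefault("AWS_DEFAULT_REGION", region)
    fs = s3fs.S3FileSystem(**regional_filesystem_kwargs(region))
    return fs.open(path, "rb")

# Example call:
# fhand = open_in_region("lithops-data-yey/notsobigtextfile_small.txt")
```

Flipping `cache_regions` between `True` and `False` while keeping everything else fixed helps tell a region-lookup problem apart from a plain network problem.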

martindurant commented 1 year ago

Perhaps the same as https://github.com/aio-libs/aiobotocore/issues/1024 ? Can you try doing this directly with aiobotocore and see if you get the same error?
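
For reference, a minimal sketch of reading the same object directly with aiobotocore, bypassing s3fs. The function names `split_s3_path` and `read_with_aiobotocore` are hypothetical; `get_session`, `create_client`, and `get_object` are the aiobotocore calls. It assumes credentials are available to botocore and that the bucket is in eu-west-3, as stated earlier in the thread.

```python
import asyncio


def split_s3_path(path: str):
    """Split 'bucket/key/parts' into (bucket, key) (hypothetical helper)."""
    bucket, _, key = path.partition("/")
    return bucket, key


async def read_with_aiobotocore(path: str, region: str = "eu-west-3") -> bytes:
    """Fetch a whole object with aiobotocore only; needs network access."""
    from aiobotocore.session import get_session  # lazy import

    bucket, key = split_s3_path(path)
    session = get_session()
    async with session.create_client("s3", region_name=region) as client:
        response = await client.get_object(Bucket=bucket, Key=key)
        # The body is a streaming response; read it before the client closes.
        async with response["Body"] as stream:
            return await stream.read()

# Example call:
# data = asyncio.run(
#     read_with_aiobotocore("lithops-data-yey/notsobigtextfile_small.txt")
# )
# print(len(data), "bytes read")
```

If this also times out, the problem is more likely in aiobotocore or the network than in s3fs.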