lavinia-k closed this issue 4 years ago
This was most likely fixed with https://github.com/intake/filesystem_spec/pull/181, which is included in fsspec 0.6.0. Let us know if anyone can reproduce with the latest versions of fsspec and s3fs.
This is happening to me. I too use read_parquet/to_parquet
and have:
fsspec==0.6.2
s3fs==0.4.0
Step Functions runs, executing exactly the same data/process, will occasionally fail with Permission Denied, though nothing changes between runs.
I appreciate it's hard to debug on Lambda, but can you tell from the logs if this is running out of retries, or if it's a type of error that s3fs doesn't realise should be retried?
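In case it helps while debugging: a minimal, stdlib-only retry sketch for the symptom above, treating PermissionError as possibly transient. The stub read function is hypothetical; in real use you would wrap the actual `pd.read_parquet(...)` call.

```python
import time

def read_with_retries(read_fn, attempts=5, base_delay=0.1):
    """Retry read_fn, treating PermissionError as possibly transient."""
    for attempt in range(attempts):
        try:
            return read_fn()
        except PermissionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff

# Demo with a stub that fails twice, then succeeds (stands in for
# something like: lambda: pd.read_parquet("s3://bucket/key.parquet")).
calls = []
def flaky_read():
    calls.append(1)
    if len(calls) < 3:
        raise PermissionError("Access Denied")
    return "ok"

result = read_with_retries(flaky_read, base_delay=0)
```

This only papers over the failure, of course; it doesn't explain why s3fs surfaces the error in the first place.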
Note that setting S3FS_LOGGING_LEVEL=DEBUG will bring you much more information.
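For example (assuming, as I believe is the case, that the variable is read when s3fs is first imported, so it must be set before the first `import s3fs` in the process, e.g. at the top of the Lambda handler module or in the function's environment configuration):

```python
import logging
import os

# Set before the first `import s3fs` so it is picked up at import time:
os.environ["S3FS_LOGGING_LEVEL"] = "DEBUG"

# Roughly equivalent manual setup with the standard logging module:
logger = logging.getLogger("s3fs")
logger.setLevel(logging.DEBUG)
if not logger.handlers:
    logger.addHandler(logging.StreamHandler())
```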
This was the output, but I have since switched to boto3 directly for reading. If you'd like, I can revert the changes and run it again with S3FS_LOGGING_LEVEL=DEBUG if what's provided below isn't detailed enough:
{
  "error": "PermissionError",
  "cause": {
    "errorMessage": "Access Denied",
    "errorType": "PermissionError",
    "stackTrace": [
      " File \"/var/task/leroydw/processing/common.py\", line 42, in _wrapper\n return fn(*args, **kwargs)\n",
      " File \"/var/task/leroydw/processing/common.py\", line 42, in _wrapper\n return fn(*args, **kwargs)\n",
      " File \"/var/task/leroydw/processing/bgoinnova/process.py\", line 22, in process_file\n df = pd.read_parquet(f's3://{event[\"File\"]}', engine=\"pyarrow\")\n",
      " File \"/tmp/sls-py-req/pandas/io/parquet.py\", line 310, in read_parquet\n return impl.read(path, columns=columns, **kwargs)\n",
      " File \"/tmp/sls-py-req/pandas/io/parquet.py\", line 121, in read\n path, _, _, should_close = get_filepath_or_buffer(path)\n",
      " File \"/tmp/sls-py-req/pandas/io/common.py\", line 185, in get_filepath_or_buffer\n filepath_or_buffer, encoding=encoding, compression=compression, mode=mode\n",
      " File \"/tmp/sls-py-req/pandas/io/s3.py\", line 48, in get_filepath_or_buffer\n file, _fs = get_file_and_filesystem(filepath_or_buffer, mode=mode)\n",
      " File \"/tmp/sls-py-req/pandas/io/s3.py\", line 38, in get_file_and_filesystem\n file = fs.open(_strip_schema(filepath_or_buffer), mode)\n",
      " File \"/tmp/sls-py-req/fsspec/spec.py\", line 724, in open\n **kwargs\n",
      " File \"/tmp/sls-py-req/s3fs/core.py\", line 315, in _open\n autocommit=autocommit, requester_pays=requester_pays)\n",
      " File \"/tmp/sls-py-req/s3fs/core.py\", line 957, in __init__\n cache_type=cache_type)\n",
      " File \"/tmp/sls-py-req/fsspec/spec.py\", line 956, in __init__\n self.details = fs.info(path)\n",
      " File \"/tmp/sls-py-req/s3fs/core.py\", line 486, in info\n return super().info(path)\n",
      " File \"/tmp/sls-py-req/fsspec/spec.py\", line 497, in info\n out = self.ls(self._parent(path), detail=True, **kwargs)\n",
      " File \"/tmp/sls-py-req/s3fs/core.py\", line 529, in ls\n files = self._ls(path, refresh=refresh)\n",
      " File \"/tmp/sls-py-req/s3fs/core.py\", line 426, in _ls\n return self._lsdir(path, refresh)\n",
      " File \"/tmp/sls-py-req/s3fs/core.py\", line 349, in _lsdir\n raise translate_boto_error(e)\n"
    ]
  }
}
I'm afraid that stack trace is not particularly useful. boto alone had no problems? It may be worth trying with s3fs master, where the parent dir should not be listed for this case.
Didn't think it would be, sorry. Using boto, it has run 6 times in a row without incident. Typically, the best with s3fs was ~4 runs before this failure would happen. I'll try master and let you know.
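For reference, here is a sketch of the boto3 workaround I mentioned (bucket/key names are made up, and the boto3/pandas calls are shown as comments so the snippet stays self-contained):

```python
import io

# Fetch the object body with boto3 and hand pandas an in-memory buffer,
# bypassing s3fs (and its listing cache / retry behaviour) entirely:
#
#   import boto3
#   import pandas as pd
#
#   obj = boto3.client("s3").get_object(Bucket="my-bucket", Key="data.parquet")
#   df = pd.read_parquet(io.BytesIO(obj["Body"].read()), engine="pyarrow")

buf = io.BytesIO(b"PAR1")  # placeholder bytes standing in for the object body
```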
I haven't been able to reproduce it using master. I'm no expert in the code base, but I don't really see any changes since 0.4.0 that would affect this; maybe it's a Heisenbug... :man_shrugging: Anyway, thanks for the help. I'll report back if something similar arises. :+1:
Heisenbug it is
Hey, I reproduced it on my end, using these dependencies:
boto3==1.12.28
botocore==1.15.28
dask[dataframe]==2.12.0
docutils==0.15.2
fastparquet==0.3.3
fsspec==0.6.3
jmespath==0.9.5
llvmlite==0.31.0
locket==0.2.0
numba==0.48.0
numpy==1.18.2
pandas==1.0.3
partd==1.1.0
python-dateutil==2.8.1
pytz==2019.3
s3fs==0.4.0
s3transfer==0.3.3
six==1.14.0
thrift==0.13.0
toolz==0.10.0
urllib3==1.25.8 ; python_version != '3.4'
Error-wise, I just get:
[ERROR] PermissionError: Access Denied
Usually I have a large burst of small files with many concurrent Lambdas, so the same Lambda context is probably being reused all the time. Let me know if there are any other logs I can bring to help fix this.
Downgraded to s3fs 0.2.0; I can confirm it works well.
Dependencies used that definitely work:
boto3==1.12.31
botocore==1.15.31
cloudpickle==1.3.0
dask[dataframe]==2.1.0
docutils==0.15.2
fastparquet==0.3.3
jmespath==0.9.5
llvmlite==0.31.0
locket==0.2.0
numba==0.48.0
numpy==1.18.2
pandas==0.25.3
partd==1.1.0
python-dateutil==2.8.1
pytz==2019.3
s3fs==0.2.0
s3transfer==0.3.3
six==1.14.0
thrift==0.13.0
toolz==0.10.0
urllib3==1.25.8 ; python_version != '3.4'
@GurRonenExplorium, it would be very helpful for us if you could do some additional logging/delving, to find out whether any of the calls were different during the reads which failed.
I have the same issue, but on EC2 rather than Lambda. In Jupyter, I tried to read a file created after the notebook was started (the file was created on another server) and it failed, but after I restarted the kernel it worked.
I guess it is related to the s3fs cache: https://github.com/dask/dask/issues/5134
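To illustrate the suspected failure mode with a toy cache (the class and names below are purely illustrative, not s3fs's actual internals, though s3fs does expose an `invalidate_cache()` method): a listing cached before the file existed keeps being returned, so the new object appears missing until the cache is dropped.

```python
class ToyFS:
    """Toy stand-in for a filesystem with a directory-listing cache."""

    def __init__(self):
        self._listing_cache = {}  # parent dir -> cached list of keys
        self._store = {}          # key -> bytes ("the bucket")

    def put(self, key, data):
        self._store[key] = data   # note: does NOT refresh the listing cache

    def ls(self, parent):
        if parent not in self._listing_cache:
            self._listing_cache[parent] = [
                k for k in self._store if k.startswith(parent)
            ]
        return self._listing_cache[parent]

    def invalidate_cache(self):
        self._listing_cache.clear()

fs = ToyFS()
fs.ls("bucket/")                      # warm the cache while the dir is empty
fs.put("bucket/new.parquet", b"...")  # file created by another process
stale = fs.ls("bucket/")              # still [] -- the cached listing wins
fs.invalidate_cache()
fresh = fs.ls("bucket/")              # now sees the new key
```

Restarting the kernel works for the same reason `invalidate_cache()` does here: it throws away the stale listing.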
Upgrading s3fs to 0.4.2 solved the issue.
This is still happening on s3fs==2023.1.0
Here are some logs:
[ERROR] PermissionError: Access Denied
Traceback (most recent call last):
File "/var/task/lambda_handler.py", line 113, in magic_function
results = pd.read_parquet(s3_valid_path)
File "/var/lang/lib/python3.7/site-packages/pandas/io/parquet.py", line 317, in read_parquet
return impl.read(path, columns=columns, **kwargs)
File "/var/lang/lib/python3.7/site-packages/pandas/io/parquet.py", line 142, in read
path, columns=columns, filesystem=fs, **kwargs
File "/var/lang/lib/python3.7/site-packages/pyarrow/parquet/core.py", line 2952, in read_table
thrift_container_size_limit=thrift_container_size_limit,
File "/var/lang/lib/python3.7/site-packages/pyarrow/parquet/core.py", line 2465, in __init__
finfo = filesystem.get_file_info(path_or_paths)
File "pyarrow/_fs.pyx", line 571, in pyarrow._fs.FileSystem.get_file_info
File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/_fs.pyx", line 1490, in pyarrow._fs._cb_get_file_info
File "/var/lang/lib/python3.7/site-packages/pyarrow/fs.py", line 332, in get_file_info
info = self.fs.info(path)
File "/var/lang/lib/python3.7/site-packages/fsspec/asyn.py", line 114, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/var/lang/lib/python3.7/site-packages/fsspec/asyn.py", line 99, in sync
raise return_result
File "/var/lang/lib/python3.7/site-packages/fsspec/asyn.py", line 54, in _runner
result[0] = await coro
File "/var/lang/lib/python3.7/site-packages/s3fs/core.py", line 1271, in _info
**self.req_kw,
File "/var/lang/lib/python3.7/site-packages/s3fs/core.py", line 340, in _call_s3
method, kwargs=additional_kwargs, retries=self.retries
File "/var/lang/lib/python3.7/site-packages/s3fs/core.py", line 139, in _error_wrapper
raise err
@adminy, we are now on 2023.5.0 - do you mind trying again? You might also try activating the region cache (cache_regions=True) or setting the default region via an environment variable or client_kwargs.
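A sketch of both options (the region name is hypothetical, and the s3fs call is commented out so the snippet stays self-contained):

```python
import os

# Option 1: default region via environment variable, picked up by botocore:
os.environ.setdefault("AWS_DEFAULT_REGION", "eu-west-1")  # hypothetical region

# Option 2 (requires s3fs): pin the region on the filesystem itself:
#
#   import s3fs
#   fs = s3fs.S3FileSystem(
#       cache_regions=True,                          # cache bucket -> region lookups
#       client_kwargs={"region_name": "eu-west-1"},  # pin the client region
#   )
```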
I can't update; there is a strict requirement to use Python 3.7 at the moment.
There is a Region environment variable set in the Lambda.
I'm running a Python 3.7 script in AWS Lambda, which runs queries against AWS Athena and tries to download the CSV results file that Athena stores on S3 once the query execution has completed.
Any ideas why I'd be running into the error below intermittently?
s3_query_result_path = f's3://{bucket}/{results_key}'
As you can see above, I'm using the pandas library, which then uses s3fs under the hood.
The Lambda works about 80% of the time, and I can't figure out anything unique about the times it fails.
Feel free to let me know if I should be posting this question in pandas or elsewhere instead - thanks for your help!