Also note that a Python call to the boto3 library (the STS `get-caller-identity` API) seemed to return the right set of credentials when running inside the Airflow runtime, but the Coiled client still failed.
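For reference, a minimal sketch of the kind of check described above (the exact snippet used isn't preserved in the issue, so this is an assumption):

```python
import boto3

# Hypothetical check: run inside the Airflow task to see which identity the
# currently resolved credentials belong to.
sts = boto3.client("sts")
identity = sts.get_caller_identity()
print(identity["Account"], identity["Arn"])
```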
The way we pass AWS credentials has been worked on a bit, and I'm hopeful this is fixed.
When writing data to S3 (with a `ddf.to_parquet()` call) using a Coiled cluster spun up within an Airflow task, credentials are not being passed correctly to the workers, resulting in a `PermissionError: Access Denied`; full traceback below.

The following works: passing the AWS credentials explicitly via the `storage_options` kwarg of the `to_parquet` call. This seems to point to Airflow somehow getting in the way of Coiled passing the locally stored AWS credentials to the Dask workers.
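As a rough illustration of that workaround (the DataFrame, bucket, and key values below are placeholders, not taken from the issue):

```python
import dask.dataframe as dd
import pandas as pd

# Hypothetical example DataFrame standing in for the real data.
ddf = dd.from_pandas(pd.DataFrame({"x": range(10)}), npartitions=2)

# Passing the credentials explicitly to the underlying s3fs filesystem
# succeeds, even when the credentials the workers resolve on their own do not.
ddf.to_parquet(
    "s3://my-bucket/output/",  # hypothetical bucket/path
    storage_options={
        "key": "<AWS_ACCESS_KEY_ID>",         # placeholder
        "secret": "<AWS_SECRET_ACCESS_KEY>",  # placeholder
    },
)
```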
Reproducer (run as Airflow DAG):
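The original reproducer code isn't preserved here; a minimal sketch of what such a DAG might look like (task names, bucket, and cluster options are assumptions):

```python
from datetime import datetime

import coiled
import dask.dataframe as dd
import pandas as pd
from airflow.decorators import dag, task
from dask.distributed import Client


@task
def write_parquet_to_s3():
    # Spin up a Coiled cluster inside the Airflow task; the locally stored
    # AWS credentials are expected to be forwarded to the workers.
    cluster = coiled.Cluster(name="airflow-coiled-repro", n_workers=2)  # assumed options
    client = Client(cluster)

    ddf = dd.from_pandas(pd.DataFrame({"x": range(100)}), npartitions=4)
    # This write fails with PermissionError: Access Denied on the workers.
    ddf.to_parquet("s3://my-bucket/airflow-coiled-test/")  # hypothetical bucket

    client.close()
    cluster.close()


@dag(schedule=None, start_date=datetime(2022, 1, 1), catchup=False)
def coiled_s3_repro():
    write_parquet_to_s3()


coiled_s3_repro()
```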
Replacing the Coiled cluster above with a local Dask cluster runs fine.
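For example, that substitution might look like this (again a sketch, not the exact code from the issue):

```python
from dask.distributed import Client, LocalCluster

# Swapping in a local cluster for the Coiled cluster; with this change the
# same to_parquet call completes, since the local AWS credentials are visible.
cluster = LocalCluster(n_workers=2)
client = Client(cluster)
```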
My local machine is set up with an IAM user whose access key and secret access key are stored in the `~/.aws/credentials` file under the `[default]` profile.

Full traceback for the `PermissionError` from the Airflow logs: