Describe the bug
If you write to an S3 bucket and have configured your S3 credentials with a user-provided function via `S3Config.credentials_provider`, Daft does not currently pass those credentials on to the writer, so the write fails to authenticate.

To Reproduce
The following will fail:
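A minimal sketch of the kind of script that triggers this (the bucket paths, provider name, and provider body are placeholders for illustration, not the exact original repro):

```python
import daft
from daft.io import IOConfig, S3Config, S3Credentials

def my_credentials_provider() -> S3Credentials:
    # In practice this would fetch short-lived credentials from a secret
    # store or STS; static placeholder values are used here.
    return S3Credentials(
        key_id="AKIA...",
        access_key="...",
        session_token="...",
    )

io_config = IOConfig(s3=S3Config(credentials_provider=my_credentials_provider))

# Reading works: the provider is invoked and its credentials are used.
df = daft.read_parquet("s3://my-bucket/input/", io_config=io_config)

# Writing fails to authenticate: the provider's credentials are never
# handed to the PyArrow filesystem that backs the writer.
df.write_parquet("s3://my-bucket/output/", io_config=io_config)
```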
Expected behavior
Daft should behave the same for reads (which currently work) and writes: it should fetch credentials from the credentials provider (or reuse cached credentials if already fetched and not expired) and pass them along to the PyArrow writer.
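As a sketch of that caching behavior (the class and its names are hypothetical, not existing Daft internals, and it assumes the returned credentials expose an optional expiry timestamp):

```python
import datetime

class CachedCredentialsProvider:
    """Wraps a user-provided credentials function, re-invoking it only
    when no credentials are cached or the cached ones have expired."""

    def __init__(self, provider):
        self._provider = provider
        self._cached = None

    def get(self):
        now = datetime.datetime.now(datetime.timezone.utc)
        if self._cached is None or (
            self._cached.expiry is not None and self._cached.expiry <= now
        ):
            self._cached = self._provider()
        return self._cached
```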
Component(s)
Parquet, CSV, Other
Additional context
Relevant part of the code where we set the PyArrow filesystem credentials for writing: https://github.com/Eventual-Inc/Daft/blob/main/daft/filesystem.py#L215-L235
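For reference, a sketch of how the fetched credentials could be threaded into the PyArrow filesystem at that point. The helper name is hypothetical and the wiring is illustrative, but `pyarrow.fs.S3FileSystem` does accept these keyword arguments:

```python
import pyarrow.fs as pafs

def _make_s3_filesystem(s3_config) -> pafs.S3FileSystem:
    # Hypothetical helper: if a credentials_provider is configured, call it
    # (or use cached, unexpired credentials) instead of ignoring it.
    if s3_config.credentials_provider is not None:
        creds = s3_config.credentials_provider()
        # Note the field mapping: Daft's key_id/access_key correspond to
        # PyArrow's access_key/secret_key.
        return pafs.S3FileSystem(
            access_key=creds.key_id,
            secret_key=creds.access_key,
            session_token=creds.session_token,
            region=s3_config.region_name,
        )
    # Fall back to the existing static-credential path.
    return pafs.S3FileSystem(
        access_key=s3_config.key_id,
        secret_key=s3_config.access_key,
        session_token=s3_config.session_token,
        region=s3_config.region_name,
    )
```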