Open munip opened 2 months ago
I think the error might be coming from the underlying pyarrow.fs.S3FileSystem
class, which is used to interact with S3:
https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html
Not sure if this supports S3 Express One Zone right now.
According to this thread, PyArrow does not currently support it https://github.com/lancedb/lancedb/issues/1206
Thanks kevinjqliu. From the other thread, it doesn't look like pyarrow supports S3 Express One Zone. Does anyone know the timeline for Express One Zone support?
@muniatl The best place to reach out would be the Arrow mailing list: https://lists.apache.org/list.html?dev@arrow.apache.org
Arrow mailing list would be a good place to start.
PyIceberg depends on pyarrow to support S3 Express One Zone. I've found https://github.com/apache/arrow-rs/issues/5140, which adds support for the Arrow Rust library. It would be great to open an issue with pyarrow to track support for S3 Express One Zone.
Question
I have been able to access an S3 bucket with PyIceberg using SqlCatalog successfully with:

    catalog = SqlCatalog(
        "default",
        {
            "uri": f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
            "warehouse": "s3://myicebergbkt/test",
            "s3.access-key-id": "myid",
            "s3.secret-access-key": "mykey",
            "s3.session-token": "my-token",
            "s3.region": "us-east-1",
        },
    )

But when I try to access an S3 Express One Zone bucket the same way, I am stuck on the syntax. I have tried all of these options with no luck:

    catalog = SqlCatalog(
        "default",
        {
            "uri": f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
            # I have also tried 730335207565:bucket/pyicebkt--use1-az4--x-s3
            # and just pyicebkt--use1-az4--x-s3, with no luck
            "warehouse": "s3://us-east-1:730335207565:bucket/pyicebkt--use1-az4--x-s3/test",
            "s3.access-key-id": "myid",
            "s3.secret-access-key": "mykey",
            "s3.session-token": "my-token",
            "s3.region": "us-east-1",
        },
    )

I get the error: "Expected an S3 object path of the form 'bucket/key...', got a URI: ". Is S3 Express One Zone supported? If so, what is the syntax for the warehouse variable?
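
To see how each candidate warehouse string splits into a bucket and a key, it may help to parse them with the standard library. This is only a diagnostic sketch: `split_warehouse` is a hypothetical helper, it uses `urllib.parse.urlparse` rather than PyIceberg's or PyArrow's own parsing, and the `--x-s3` suffix check reflects the naming convention for S3 Express One Zone directory buckets.

```python
from urllib.parse import urlparse

# S3 Express One Zone directory bucket names end with this suffix.
DIRECTORY_BUCKET_SUFFIX = "--x-s3"


def split_warehouse(location: str) -> dict:
    """Hypothetical helper: split an s3:// location into bucket and key.

    Everything between "s3://" and the first "/" is treated as the bucket,
    which is how a generic URI parser reads the string.
    """
    uri = urlparse(location)
    return {
        "scheme": uri.scheme,
        "bucket": uri.netloc,
        "key": uri.path.lstrip("/"),
        "looks_like_directory_bucket": uri.netloc.endswith(DIRECTORY_BUCKET_SUFFIX),
    }


candidates = [
    "s3://myicebergbkt/test",
    "s3://us-east-1:730335207565:bucket/pyicebkt--use1-az4--x-s3/test",
    "s3://pyicebkt--use1-az4--x-s3/test",
]

for candidate in candidates:
    print(candidate, "->", split_warehouse(candidate))
```

With the ARN-style string, the part before the first slash (`us-east-1:730335207565:bucket`) is read as the bucket and the real bucket name lands in the key, so that form is unlikely to resolve in any case. The plain `s3://pyicebkt--use1-az4--x-s3/test` form at least parses to the expected bucket, but whether it actually works still depends on the underlying file system supporting S3 Express One Zone.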