Closed c-thiel closed 2 weeks ago
I believe we should enable the storage we want in iceberg
crate.
I believe we should enable the storage we want in
iceberg
crate.
Will this cause the library to include unnecessary dependencies? I think using features to control what's included would be better? And we can provide a all-io
feature flag for convenience?
cc @Xuanwo @Fokko
BTW how the integration tests passed when s3 is not included? I think at least this test can't pass?
BTW how the integration tests passed when s3 is not included? I think at least this test can't pass?
Hm, good point.
We are testing via cargo test --no-fail-fast --all-targets --all-features --workspace
Could it sneak its way in somehow via dev-dependencies of catalogs or datafusion?: https://github.com/apache/iceberg-rust/blob/4407634d4e7760afabd8789f585cfee0e6b19e8f/crates/catalog/rest/Cargo.toml#L49
Good catch! Thanks @c-thiel for the finding!
Maybe we can add features like storage-s3
, storage-gcs
at the iceberg side.
Maybe we can add features like
storage-s3
,storage-gcs
at the iceberg side.
+1
Let me fix this.
After the opendal update to opendal 0.4.6 in https://github.com/apache/iceberg-rust/commit/3b8121eaa9e9628536093836dcc41119716afd9e#diff-2e9d962a08321605940b5a657135052fbcef87b5e360662bb527c96d9a615542, the iceberg crate no longer supports S3 on its own.
I assume this is because in opendal 0.4.6 the "default" features changed to https://github.com/apache/opendal/blob/main/core/Cargo.toml#L42
default = ["reqwest/rustls-tls", "executors-tokio", "services-memory"]
From https://github.com/apache/opendal/blob/v0.45.0/core/Cargo.toml:
However, the iceberg crate holds
FileIO
, so we should expose S3 functionality at least via a feature. I see that i.e. the catalog crates all specify some opendal feature flag.Maybe @Xuanwo can have a look since you know both sides?
Original Error upon calling
.new_output('s3://some/path')
on FileIO:Source: Unsupported (permanent) at , context: { scheme: s3 } => scheme is not enabled or supported