Open singhpk234 opened 1 week ago
LGTM with a minor style issue:
> The following files had format violations:
src/main/java/org/apache/polaris/service/catalog/BasePolarisCatalog.java
@@ -47,7 +47,6 @@
import·org.apache.iceberg.TableMetadata;
import·org.apache.iceberg.TableMetadataParser;
import·org.apache.iceberg.TableOperations;
-import·org.apache.iceberg.aws.s3.S3FileIOProperties;
import·org.apache.iceberg.catalog.Namespace;
import·org.apache.iceberg.catalog.SupportsNamespaces;
import·org.apache.iceberg.catalog.TableIdentifier;
Run './gradlew :polaris-service:spotlessApply' to fix these violations.
Checking the test failure.
On debugging it, it looks like we need to handle this change cleanly [1], since Polaris supports more endpoints than the default implementation. Let me fix this!
@flyrain @singhpk234 should we also update the "getting started" examples to the Iceberg 1.7.0 dependencies? Most of them are using the 1.5.2 Iceberg runtime. I can take this if it's not already done.
Note: Today I realized that there are two "special" changes in Iceberg/Java 1.7.0:
1. Iceberg/Java 1.7.0 refuses to submit some requests like the multi-table-commit, if the service does not return ConfigResponse.endpoints() or that list is empty.
2. Object-storage layout - the generated paths have changed. This is relevant when IAM policies for S3 + GCS are generated. IIRC, IAM policies generated by Polaris are not yet adapted for object-storage layout (write.object-storage.enabled=true + data path). Before 1.7.0, the paths were generated like s3://bucket/<random-ish-base64>/..., since 1.7.0 it's s3://bucket/<random-ish-binary-part-1>/<random-ish-binary-part-2>/<random-ish-binary-part-3>/<random-ish-binary-part-4>/...
This is due to the new Object Storage v3 changes introduced in the 1.7 release. However, since Polaris doesn’t currently support the hash prefix before the table path, this update should not impact functionality as far as I understand. Let me know if I’m missing any nuances here!
For Polaris it's the amount of combinations that need to be supported: no object-storage prefix, the old and the new one. For GCS it's easier, because Google accepts regular expressions - S3 however does not.
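To make the three combinations concrete, here is a minimal sketch that classifies a file location by how many path segments sit between the data root and the table path. It relies only on the shapes described in this thread (no prefix, one random-ish segment before 1.7.0, four segments since 1.7.0). `ObjectStorageLayouts`, `classify`, and the example paths are hypothetical and are not part of Polaris or Iceberg.

```java
/** Hypothetical helper for illustration only; not a Polaris or Iceberg API. */
final class ObjectStorageLayouts {

  /** The three layouts mentioned in the thread, plus a catch-all. */
  enum Layout { NO_PREFIX, SINGLE_SEGMENT, FOUR_SEGMENTS, UNKNOWN }

  /**
   * Counts the path segments between the data root and the table path.
   * Assumption: the random-ish segments appear directly after the data root
   * and directly before the table path.
   */
  static Layout classify(String dataRoot, String tablePath, String fileLocation) {
    String root = dataRoot.endsWith("/") ? dataRoot : dataRoot + "/";
    if (!fileLocation.startsWith(root)) {
      return Layout.UNKNOWN;
    }
    String relative = fileLocation.substring(root.length());
    if (relative.startsWith(tablePath + "/")) {
      return Layout.NO_PREFIX; // table path follows the root immediately
    }
    int idx = relative.indexOf("/" + tablePath + "/");
    if (idx <= 0) {
      return Layout.UNKNOWN; // table path not found, or an empty prefix
    }
    int segments = relative.substring(0, idx).split("/").length;
    if (segments == 1) {
      return Layout.SINGLE_SEGMENT; // shape described for releases before 1.7.0
    }
    if (segments == 4) {
      return Layout.FOUR_SEGMENTS; // shape described for 1.7.0
    }
    return Layout.UNKNOWN;
  }
}
```

A policy generator would have to cover each of these shapes for every table location, which is where the combinations add up on S3.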
> Iceberg/Java 1.7.0 refuses to submit some requests like the multi-table-commit, if the service does not return ConfigResponse.endpoints() or that list is empty.
383 will resolve this issue, so let's prioritize it. In the short term, it should be manageable since most REST clients haven’t yet adopted the latest 1.7 release.
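For readers following along, the client-side behavior can be modeled roughly as below. This is a simplified sketch and not the Iceberg implementation: `EndpointNegotiation`, `ASSUMED_DEFAULTS`, and `canSubmit` are hypothetical names, and the fallback set is an assumption chosen only to show why an empty or missing endpoint list blocks the multi-table commit.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of endpoint negotiation; not the Iceberg client code. */
final class EndpointNegotiation {
  // Illustrative endpoint identifiers in "METHOD path" form.
  static final String LOAD_TABLE = "GET /v1/{prefix}/namespaces/{namespace}/tables/{table}";
  static final String UPDATE_TABLE = "POST /v1/{prefix}/namespaces/{namespace}/tables/{table}";
  static final String COMMIT_TRANSACTION = "POST /v1/{prefix}/transactions/commit";

  // Assumption: the fallback used when the server advertises nothing
  // does not contain the multi-table commit endpoint.
  static final Set<String> ASSUMED_DEFAULTS =
      new HashSet<>(Arrays.asList(LOAD_TABLE, UPDATE_TABLE));

  /** Returns the advertised endpoints, or the assumed defaults if none were sent. */
  static Set<String> effectiveEndpoints(List<String> advertised) {
    if (advertised == null || advertised.isEmpty()) {
      return ASSUMED_DEFAULTS;
    }
    return new HashSet<>(advertised);
  }

  /** True if the client would be allowed to call the given endpoint. */
  static boolean canSubmit(List<String> advertised, String endpoint) {
    return effectiveEndpoints(advertised).contains(endpoint);
  }
}
```

In this model, a server that supports more than the default set has to list those endpoints explicitly in its config response for the client to use them.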
> @flyrain @singhpk234 should we also update the "getting started" examples to the Iceberg 1.7.0 dependencies? Most of them are using the 1.5.2 Iceberg runtime. I can take this if it's not already done.
+1, @MonkeyCanCode! Thanks for picking it up as a follow-up.
Anytime. I will open a PR for this once the current PR is merged. Then I can build it locally for both the client (Spark) and the server (Polaris).
I will merge this by EOD if there is no objection.
Here is a PR for the 2 missing changes: https://github.com/apache/polaris/pull/447
Description
Upgrade Apache Iceberg to 1.7.0. Additionally: