apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Nested namespace support is broken in 1.7.0 #11539

Open mayankvadariya opened 3 days ago

mayankvadariya commented 3 days ago

Apache Iceberg version

1.7.0 (latest release)

Query engine

Trino

Please describe the bug 🐞

After upgrading the Iceberg library from 1.6.1 to 1.7.0 in Trino, nested namespace requests are sent to the Iceberg catalog server incorrectly. I've provided further analysis in https://github.com/apache/iceberg/pull/10858#issuecomment-2471482802. After commit https://github.com/trinodb/trino/commit/ead6d9f7dc7fe94321bcf452dc90f26eb0d3a2f5 (which bumps Iceberg from 1.6.1 to 1.7.0), show schemas fails in the Iceberg connector with the error below.

iceberg.catalog.type=rest
iceberg.rest-catalog.uri=
iceberg.rest-catalog.warehouse=
iceberg.rest-catalog.security=OAUTH2
iceberg.rest-catalog.oauth2.credential=
iceberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALL
iceberg.rest-catalog.nested-namespace-enabled=true

trino:tpch> create schema level1;
CREATE SCHEMA
trino:tpch> create schema iceberg."level1.level2";
CREATE SCHEMA
trino:tpch> show schemas;

Query 20241113_175503_00027_msjru, FAILED, 3 nodes
http://localhost:8080/ui/query.html?20241113_175503_00027_msjru
Splits: 17 total, 1 done (5.88%)
CPU Time: 0.0s total,     0 rows/s,     0B/s, 20% active
Per Node: 0.0 parallelism,     0 rows/s,     0B/s
Parallelism: 0.1
Peak Memory: 984B
0.07 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20241113_175503_00027_msjru failed: Error listing schemas for catalog iceberg: Namespace does not exist: level1%1Flevel2
io.trino.spi.TrinoException: Error listing schemas for catalog iceberg: Namespace does not exist: level1%1Flevel2
        at io.trino.metadata.MetadataListing.handleListingException(MetadataListing.java:358)
        at io.trino.metadata.MetadataListing.listSchemas(MetadataListing.java:99)
        at io.trino.metadata.MetadataListing.listSchemas(MetadataListing.java:90)
        at io.trino.connector.informationschema.InformationSchemaPageSource.addSchemataRecords(InformationSchemaPageSource.java:331)
        at io.trino.connector.informationschema.InformationSchemaPageSource.buildPages(InformationSchemaPageSource.java:227)
        at io.trino.connector.informationschema.InformationSchemaPageSource.getNextPage(InformationSchemaPageSource.java:185)
        at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:268)
        at io.trino.operator.Driver.processInternal(Driver.java:403)
        at io.trino.operator.Driver.lambda$process$8(Driver.java:306)
        at io.trino.operator.Driver.tryWithLock(Driver.java:709)
        at io.trino.operator.Driver.process(Driver.java:298)
        at io.trino.operator.Driver.processForDuration(Driver.java:269)
        at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:890)
        at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:77)
        at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:201)
        at io.trino.$gen.Trino_testversion____20241113_175430_71.run(Unknown Source)
        at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:202)
        at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:172)
        at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:159)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1575)
Caused by: org.apache.iceberg.exceptions.NoSuchNamespaceException: Namespace does not exist: level1%1Flevel2
        at org.apache.iceberg.rest.ErrorHandlers$NamespaceErrorHandler.accept(ErrorHandlers.java:173)
        at org.apache.iceberg.rest.ErrorHandlers$NamespaceErrorHandler.accept(ErrorHandlers.java:166)
        at org.apache.iceberg.rest.HTTPClient.throwFailure(HTTPClient.java:211)
        at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:323)
        at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:262)
        at org.apache.iceberg.rest.HTTPClient.get(HTTPClient.java:358)
        at org.apache.iceberg.rest.RESTClient.get(RESTClient.java:96)
        at org.apache.iceberg.rest.RESTSessionCatalog.listNamespaces(RESTSessionCatalog.java:630)
        at io.trino.plugin.iceberg.catalog.rest.TrinoRestCatalog.collectNamespaces(TrinoRestCatalog.java:170)
        at io.trino.plugin.iceberg.catalog.rest.TrinoRestCatalog.lambda$collectNamespaces$0(TrinoRestCatalog.java:173)
        at java.base/java.util.stream.ReferencePipeline$7$1FlatMap.accept(ReferencePipeline.java:289)
        at java.base/java.util.Collections$2.tryAdvance(Collections.java:5075)
        at java.base/java.util.Collections$2.forEachRemaining(Collections.java:5083)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:727)
        at io.trino.plugin.iceberg.catalog.rest.TrinoRestCatalog.collectNamespaces(TrinoRestCatalog.java:174)
        at io.trino.plugin.iceberg.catalog.rest.TrinoRestCatalog.lambda$collectNamespaces$0(TrinoRestCatalog.java:173)
        at java.base/java.util.stream.ReferencePipeline$7$1FlatMap.accept(ReferencePipeline.java:289)
        at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:1024)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:727)
        at io.trino.plugin.iceberg.catalog.rest.TrinoRestCatalog.collectNamespaces(TrinoRestCatalog.java:174)
        at io.trino.plugin.iceberg.catalog.rest.TrinoRestCatalog.listNamespaces(TrinoRestCatalog.java:161)
        at io.trino.plugin.iceberg.IcebergMetadata.listSchemaNames(IcebergMetadata.java:436)
        at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.listSchemaNames(ClassLoaderSafeConnectorMetadata.java:200)
        at io.trino.tracing.TracingConnectorMetadata.listSchemaNames(TracingConnectorMetadata.java:133)
        at io.trino.metadata.MetadataManager.listSchemaNames(MetadataManager.java:260)
        at io.trino.tracing.TracingMetadata.listSchemaNames(TracingMetadata.java:172)
        at io.trino.metadata.MetadataListing.doListSchemas(MetadataListing.java:105)
        at io.trino.metadata.MetadataListing.listSchemas(MetadataListing.java:96)
        ... 24 more
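
The trace above shows TrinoRestCatalog.collectNamespaces re-entering itself through a flatMap lambda, which suggests that show schemas walks the namespace tree level by level and that the failing call is the listing of children under level1.level2. The following is a minimal sketch of that kind of recursive walk, not Trino's actual code: the ChildLister interface, the in-memory tree, and all names are hypothetical stand-ins for the REST catalog.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RecursiveNamespaceListingSketch {
    // Hypothetical stand-in for a catalog: returns the direct children of a parent namespace.
    interface ChildLister {
        List<List<String>> listChildren(List<String> parent);
    }

    // Depth-first walk: record each child as a dotted name, then descend into it.
    static List<String> collect(ChildLister catalog, List<String> parent) {
        List<String> names = new ArrayList<>();
        for (List<String> child : catalog.listChildren(parent)) {
            names.add(String.join(".", child));
            names.addAll(collect(catalog, child));
        }
        return names;
    }

    // In-memory tree mirroring the schemas created in the report (assumed layout).
    static ChildLister sampleCatalog() {
        Map<List<String>, List<List<String>>> tree = new HashMap<>();
        tree.put(List.of(), List.of(List.of("level1")));
        tree.put(List.of("level1"), List.of(List.of("level1", "level2")));
        tree.put(List.of("level1", "level2"), List.of());
        return parent -> {
            List<List<String>> children = tree.get(parent);
            if (children == null) {
                // A parent the catalog cannot resolve fails the whole listing.
                throw new IllegalStateException("Namespace does not exist: " + String.join(".", parent));
            }
            return children;
        };
    }

    public static void main(String[] args) {
        System.out.println(collect(sampleCatalog(), List.of())); // prints [level1, level1.level2]
    }
}
```

In this sketch a single unresolvable parent aborts the whole walk, which would be consistent with show schemas failing outright rather than returning a partial list.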

Willingness to contribute

nastra commented 1 day ago

@mayankvadariya can you provide additional details about which REST server this is running against? Are there any reproducible tests in Trino itself that make this easy to run and reproduce? I believe the issue is that a call to RESTUtil.decodeNamespace(parent) is missing (similar to what has been added in https://github.com/apache/iceberg/pull/10858/files#diff-68e9dc9e0447ceb6ee81935a693797a7228775601f6366f50db42b3faef47ec3R291)
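
To illustrate why a missing decode step could produce the name level1%1Flevel2 seen in the error, here is a rough sketch that uses only the Java standard library. It is not Iceberg's RESTUtil implementation; the class and method names are hypothetical, and it assumes that levels are URL-encoded and joined with %1F (the encoded form of the 0x1F unit separator), as the error message suggests.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class NamespaceCodecSketch {
    // Assumption: raw levels are joined with the unit separator (0x1F), which appears as %1F once encoded.
    static final String SEPARATOR = "\u001f";
    static final String ENCODED_SEPARATOR = "%1F";

    // URL-encode each level, then join the encoded levels with %1F.
    static String encode(List<String> levels) {
        List<String> encoded = new ArrayList<>();
        for (String level : levels) {
            encoded.add(URLEncoder.encode(level, StandardCharsets.UTF_8));
        }
        return String.join(ENCODED_SEPARATOR, encoded);
    }

    // Reverse of encode: split on %1F first, then URL-decode each level.
    static List<String> decode(String encoded) {
        List<String> levels = new ArrayList<>();
        for (String part : encoded.split(ENCODED_SEPARATOR, -1)) {
            levels.add(URLDecoder.decode(part, StandardCharsets.UTF_8));
        }
        return levels;
    }

    // A receiver that skips decoding and splits on the raw separator sees one opaque level.
    static List<String> splitWithoutDecoding(String received) {
        return List.of(received.split(SEPARATOR, -1));
    }

    public static void main(String[] args) {
        String wire = encode(List.of("level1", "level2"));
        System.out.println(wire);                        // level1%1Flevel2
        System.out.println(decode(wire));                // [level1, level2]
        System.out.println(splitWithoutDecoding(wire));  // [level1%1Flevel2]
    }
}
```

If the receiving side behaves like splitWithoutDecoding, it looks up a single namespace literally named level1%1Flevel2, which does not exist.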

nastra commented 1 day ago

Currently I cannot reproduce this issue with the below Trino test:

--- a/plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/catalog/rest/TestIcebergRestCatalogNestedNamespaceConnectorSmokeTest.java
+++ b/plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/catalog/rest/TestIcebergRestCatalogNestedNamespaceConnectorSmokeTest.java
@@ -13,6 +13,7 @@
  */
 package io.trino.plugin.iceberg.catalog.rest;

+import com.google.common.collect.ImmutableList;
 import com.google.common.collect.ImmutableMap;
 import io.airlift.http.server.testing.TestingHttpServer;
 import io.trino.filesystem.Location;
@@ -22,6 +23,7 @@ import io.trino.plugin.iceberg.SchemaInitializer;
 import io.trino.plugin.iceberg.TestingIcebergPlugin;
 import io.trino.plugin.tpch.TpchPlugin;
 import io.trino.testing.DistributedQueryRunner;
+import io.trino.testing.MaterializedRow;
 import io.trino.testing.QueryRunner;
 import io.trino.testing.TestingConnectorBehavior;
 import org.apache.iceberg.BaseTable;
@@ -149,6 +151,18 @@ final class TestIcebergRestCatalogNestedNamespaceConnectorSmokeTest
         assertQueryFails("SELECT * FROM nested_namespace_disabled.\"level_1.level_2\".region", "Nested namespace is not enabled for this catalog");
     }

+    @Test
+    void testNestedNamespace()
+    {
+        assertUpdate("CREATE SCHEMA iceberg.first");
+        assertUpdate("CREATE SCHEMA iceberg.\"first.second\"");
+        assertUpdate("CREATE SCHEMA iceberg.\"first.second.third\"");
+        assertThat(computeActual("show schemas from iceberg"))
+                .contains(new MaterializedRow(ImmutableList.of("first")),
+                        new MaterializedRow(ImmutableList.of("first.second")),
+                        new MaterializedRow(ImmutableList.of("first.second.third")));
+    }
+
bryanck commented 1 day ago

I can reproduce this if I use Trino with Iceberg 1.7 (commit ead6d9f) connecting to an Iceberg REST catalog running Iceberg 1.6 (using the Tabular image). If I update the image to Iceberg 1.7, I don't get this error.

flyrain commented 1 day ago

Polaris hits the same issue. The potential fix seems like a good candidate for 1.7.1.