apache / amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
https://amoro.apache.org/
Apache License 2.0

[Bug]: Multiple entries with same key: watermark.base when creating a new mix_iceberg table #1849

Closed: YesOrNo828 closed this issue 1 month ago

YesOrNo828 commented 1 year ago

What happened?

Output of show create table credit.test (credit.test is a Mixed-Hive format table):

CREATE TABLE `arctic_catalog`.`credit`.`test` (
  ........,
  CONSTRAINT `8874a88c-a3fc-4bcf-b889-d944b9c6a615` PRIMARY KEY (`window_start`, `window_end`, `account`, `product_id`) NOT ENFORCED
) PARTITIONED BY (`hour`)
WITH (
  'log-store.enabled' = 'true',
  'snapshot.change.keep.minutes' = '720',
  'table.create-timestamp' = '1681373105955',
  'arctic.server.name' = 'arctic-server',
  'self-optimizing.major.trigger.interval' = '600000',
  'write.metadata.delete-after-commit.enabled' = 'true',
  'optimize.group' = 'nisp_queue',
  'schema.name-mapping.default' = '......',
  'change.data.ttl.minutes' = '180',
  'log-store.data-version' = 'v1',
  'watermark.table' = '1692155633896',
  'log-store.topic' = '......',
  'snapshot.change.keeep.minutes' = '720',
  'ingestion.meta' = '{"fulldataMeta":{"enable":false,"mysqlCluster":null,"mysqlDB":null,"mysqlTable":null,"type":"MYSQL","parallelism":1,"columnMapping":null,"partitionFields":null,"fullDataProperties":null},"realtimeDataMeta":{"enable":false,"kafkaCluster":null,"kafkaTopic":null,"startOffsetType":null,"offsetTimestamp":null,"kafkaSerdeType":null,"parallelism":1,"columnMapping":null,"partitionFields":null,"filterType":null,"filter":null,"enableNdc":false,"ndcUrl":null,"dbHubId":0,"realtimeProperties":null,"sourceInfo":null}}',
  'optimize.full.trigger.max-interval' = '900000',
  'creator.user.id' = '',
  'stream.message.topic.replications' = '2',
  'flink.max-continuous-empty-commits' = '2147483647',
  'table.partition-properties' = '......',
  'stream.message-queue.kafka.name' = '***',
  'write.upsert.enabled' = 'true',
  'format' = 'json',
  'stream.message.topic.partitions' = '10',
  'base.hive.location-root' = 'hdfs://....../hive',
  'log-store.address' = '......',
  'watermark.base' = '1692155633896',
  'self-optimizing.full.trigger.interval' = '900000',
  'creator.user.name' = '......',
  'table.type' = 'ADAPT_HIVE',
  'column.list' = '......'
)

Executing the CREATE TABLE LIKE statement below throws an exception:

Flink SQL> create table credit.yxx_test_dim_account like credit.`test`;
[ERROR] Could not execute SQL statement. Reason:
java.lang.IllegalArgumentException: Multiple entries with same key: watermark.base=1692155786607 and watermark.base=1692155786607
Caused by: org.apache.flink.table.api.TableException: Could not execute CreateTable in path `arctic_catalog`.`credit`.`yxx_test_dim_account`
        at org.apache.flink.table.catalog.CatalogManager.execute(CatalogManager.java:847) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.catalog.CatalogManager.createTable(CatalogManager.java:659) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:881) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeOperation$3(LocalExecutor.java:209) ~[flink-sql-client_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88) ~[flink-sql-client_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.client.gateway.local.LocalExecutor.executeOperation(LocalExecutor.java:209) ~[flink-sql-client_2.12-1.14.5.jar:1.14.5]
        ... 11 more
Caused by: java.lang.IllegalArgumentException: Multiple entries with same key: watermark.base=1692155786607 and watermark.base=1692155786607
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap.conflictException(ImmutableMap.java:377) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:371) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:241) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.RegularImmutableMap.fromEntryArrayCheckingBucketOverflow(RegularImmutableMap.java:132) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:94) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:573) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap$Builder.buildOrThrow(ImmutableMap.java:601) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.shaded.org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:588) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.table.BasicKeyedTable.properties(BasicKeyedTable.java:134) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.hive.table.KeyedHiveTable.enableSyncHiveSchemaToArctic(KeyedHiveTable.java:98) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.hive.table.KeyedHiveTable.<init>(KeyedHiveTable.java:61) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.hive.catalog.MixedHiveTables.createKeyedTable(MixedHiveTables.java:175) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.catalog.MixedTables.createTableByMeta(MixedTables.java:143) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.catalog.BasicArcticCatalog$ArcticTableBuilder.createTableByMeta(BasicArcticCatalog.java:297) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.catalog.BasicArcticCatalog$ArcticTableBuilder.create(BasicArcticCatalog.java:290) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at com.netease.arctic.flink.catalog.ArcticCatalog.createTable(ArcticCatalog.java:308) ~[amoro-flink-runtime-1.14-0.5.0-SNAPSHOT.jar:?]
        at org.apache.flink.table.catalog.CatalogManager.lambda$createTable$10(CatalogManager.java:661) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.catalog.CatalogManager.execute(CatalogManager.java:841) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.catalog.CatalogManager.createTable(CatalogManager.java:659) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:881) ~[flink-table_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeOperation$3(LocalExecutor.java:209) ~[flink-sql-client_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88) ~[flink-sql-client_2.12-1.14.5.jar:1.14.5]
        at org.apache.flink.table.client.gateway.local.LocalExecutor.executeOperation(LocalExecutor.java:209) ~[flink-sql-client_2.12-1.14.5.jar:1.14.5]
        ... 11 more
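
For reference, the IllegalArgumentException above is standard Guava ImmutableMap behavior: the trace shows BasicKeyedTable.properties() building the merged table properties with an ImmutableMap builder, which rejects a repeated key even when both values are identical. A minimal standalone sketch (using plain Guava rather than the shaded copy, and the property key from this report) reproduces the same failure mode:

import com.google.common.collect.ImmutableMap;

public class DuplicateKeyDemo {
  public static void main(String[] args) {
    ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
    // The same key added twice, e.g. contributed by two of the underlying
    // property sources that get merged into the table properties.
    builder.put("watermark.base", "1692155786607");
    builder.put("watermark.base", "1692155786607");
    // Throws java.lang.IllegalArgumentException:
    //   Multiple entries with same key: watermark.base=... and watermark.base=...
    builder.buildOrThrow();
  }
}
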

If I retry the same CREATE TABLE statement, another exception occurs:

Flink SQL> create table credit.yxx_test_dim_account like credit.`test`;
2023-08-16 11:28:29,722 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory      [] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
[ERROR] Could not execute SQL statement. Reason:
java.lang.IllegalArgumentException: Table is already existed in hive meta store:arctic_nisp.credit.yxx_test_dim_account

SHOW TABLES in the Flink SQL Client did not list the target table, but I found that it had actually been created as a Hive table and shows up on the AMS website.
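
Since the first attempt apparently created the Hive-side table before failing, a retry trips the "already existed in hive meta store" check. Purely as an illustration (not an Amoro-provided procedure, and assuming any AMS-side metadata is cleaned up separately), the leftover Hive entry could be inspected and removed with the plain Hive metastore client before retrying:

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class DropOrphanHiveTable {
  public static void main(String[] args) throws Exception {
    // Assumes hive-site.xml for the target metastore is on the classpath;
    // the database and table names are the ones from this report.
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    try {
      if (client.tableExists("credit", "yxx_test_dim_account")) {
        // Remove only the orphaned Hive entry so CREATE TABLE LIKE can be retried.
        client.dropTable("credit", "yxx_test_dim_account");
      }
    } finally {
      client.close();
    }
  }
}
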

(screenshot of the table shown on the AMS website)

Affects Versions

0.5.0

What engines are you seeing the problem on?

Core, Flink

How to reproduce

No response

Relevant log output

No response

Anything else

No response

YesOrNo828 commented 1 year ago

This error is reported when using the 0.5.0 Flink code to access a 0.4.* AMS.
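
If the property map coming back from a 0.4.* AMS ends up contributing the same key (such as watermark.base) from two sources, one way the client side could tolerate it is to merge into a plain mutable map, where the later value simply overwrites the earlier one instead of throwing. This is only an illustrative sketch of the idea, not the actual fix shipped in Amoro:

import java.util.HashMap;
import java.util.Map;

public class TolerantPropertyMerge {

  // Hypothetical helper: merge two property maps, letting tableProps win on
  // duplicate keys such as "watermark.base" instead of failing.
  static Map<String, String> merge(Map<String, String> baseProps, Map<String, String> tableProps) {
    Map<String, String> merged = new HashMap<>(baseProps);
    merged.putAll(tableProps);
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> base = Map.of("watermark.base", "1692155786607");
    Map<String, String> table = Map.of("watermark.base", "1692155786607", "format", "json");
    // Prints a map with a single watermark.base entry; no IllegalArgumentException.
    System.out.println(merge(base, table));
  }
}
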

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] commented 1 month ago

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.