DataLinkDC / dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
http://www.dinky.org.cn
Apache License 2.0
3.1k stars 1.14k forks source link

[Bug] [HIVE] Did not correctly interpret the Hive table creation statement, but the older version 0.7.1 works. #3874

Open chenhaipeng opened 2 hours ago

chenhaipeng commented 2 hours ago

Search before asking

What happened

env info: dinky: v1.10 jdk: jdk8 flink: flink1.14.6

sql is show as flow

SET execution.checkpointing.interval = 12h;
SET execution.checkpointing.tolerable-failed-checkpoints = 3;
SET execution.checkpointing.timeout = 300s;
SET execution.checkpointing.externalized-checkpoint-retention = RETAIN_ON_CANCELLATION;
SET execution.checkpointing.mode = EXACTLY_ONCE;
SET execution.checkpointing.unaligned = true;
SET pipeline.operator-chaining = false;
SET yarn.application.queue=queue_3001_02;
set table.exec.state.ttl=60000;

SET table.sql-dialect=hive;
CREATE CATALOG myhive WITH (
    'type' = 'hive',
    'default-database' = '${imeg_hive_db_name}',
    'hive-conf-dir' = '/appcom/config/hive-config',
    'hadoop-conf-dir'='/appcom/config/hadoop-config'
);

USE CATALOG myhive;
use ${imeg_hive_db_name};
CREATE TABLE IF NOT EXISTS `ilh_turingshield_txn_detl` (
    bizSeqNo string,
    channel string,
    mpcUid string,
    deviceId string,
    riskDeviceId string,
    riskLevel string,
    riskType string,
    extraInfo string,
    requestId string,
    tokenTime string,
    userAction string
) PARTITIONED BY (ds string) STORED AS orc TBLPROPERTIES (
  'partition.time-extractor.timestamp-pattern'='$ds 00:00:00',
  'sink.partition-commit.trigger'='partition-time',
  'sink.partition-commit.delay'='0 s',
  'sink.partition-commit.policy.kind'='metastore',
  'orc.compress' = 'SNAPPY',
  'auto-compaction' = 'true',
  'sink.rolling-policy.rollover-interval' = '12 h',
  'sink.shuffle-by-partition.enable' = 'true',
  'sink.parallelism' = '1'
);

SET table.sql-dialect=default;

drop table if exists ilh_turingshield_txn_kfk;
create table if not exists ilh_turingshield_txn_kfk(
  `event_time` TIMESTAMP(3) METADATA FROM 'timestamp',
   `partition_id` BIGINT METADATA FROM 'partition' VIRTUAL,  -- from Kafka connector
    `offset` BIGINT METADATA VIRTUAL,  -- from Kafka connector
    bizSeqNo string,
    channel string,
    mpcUid string,
    deviceId string,
    riskDeviceId string,
    riskLevel string,
    riskType string,
    extraInfo string,
    requestId string,
    tokenTime string,
    userAction string,
    proctime as PROCTIME(),
    WATERMARK FOR event_time as event_time - INTERVAL '10' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'tcbtrd_ilh_turingshield_txn',
  'properties.bootstrap.servers' = '${kafka_bootstrap_servers}',
  'properties.group.id' = 'turingshield_01',
  'scan.startup.mode' = 'group-offsets',
  'value.format' = 'json',
  'value.json.fail-on-missing-field' = 'false',
  'value.json.ignore-parse-errors' = 'true',
  'properties.security.protocol' = 'SASL_PLAINTEXT',
  'properties.sasl.mechanism' = 'GSSAPI',
  'properties.sasl.kerberos.service.name' = 'hadoop',
  'properties.sasl.jaas.config' = 'com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/appcom/keytab/hduser3001.keytab" storeKey=true useTicketCache=false principal="hduser3001@WEBANK.COM";'
);

INSERT INTO ilh_turingshield_txn_detl 
SELECT 
bizSeqNo,
channel,
mpcUid,
deviceId,
riskDeviceId,
riskLevel,
riskType,
extraInfo,
requestId,
tokenTime,
userAction,
DATE_FORMAT(event_time, 'yyyy-MM-dd')
FROM ilh_turingshield_txn_kfk;

But the prompt is: image but is normal in version v0.7.1 image

What you expected to happen

i debug step by step : in version v1.1.0 ,the currentDialect is DEFATLT image

but in version v0.7.1,the currentDialect is HIVE image

How to reproduce

Hive SQL described above

Anything else

No response

Version

1.1.0

Are you willing to submit PR?

Code of Conduct

github-actions[bot] commented 2 hours ago

Hello @chenhaipeng, this issue is about CDC/CDCSOURCE, so I assign it to @aiwenmo. If you have any questions, you can comment and reply.

你好 @chenhaipeng, 这个 issue 是关于 CDC/CDCSOURCE 的,所以我把它分配给了 @aiwenmo。如有任何问题,可以评论回复。

github-actions[bot] commented 2 hours ago

Hello @chenhaipeng, this issue is about web, so I assign it to @Zzm0809. If you have any questions, you can comment and reply.

你好 @chenhaipeng, 这个 issue 是关于 web 的,所以我把它分配给了 @Zzm0809。如有任何问题,可以评论回复。