itinycheng / flink-connector-clickhouse

Flink SQL connector for ClickHouse. Support ClickHouseCatalog and read/write primary data, maps, arrays to clickhouse.
Apache License 2.0
363 stars 154 forks source link

读取clickhouse的时候,不会一直读取,加载完了当前时间的数据,就退出了 #62

Closed xyhc closed 1 year ago

xyhc commented 1 year ago
/**
 * 关联clickhouse, 建表a
 */
tEnv.executeSql(
  s"""
     |CREATE TABLE a (
     | channel_id INT,
     | code_id INT,
     | account_id BIGINT,
     | create_time TIMESTAMP_LTZ(3),
     | primary key(channel_id,code_id,account_id) NOT ENFORCED,
     | WATERMARK FOR create_time AS create_time - INTERVAL '1' HOUR
     |) WITH (
     |  'connector' = 'clickhouse',
     |  'url' = '$ckURL',
     |  'database-name' = 'db',
     |  'username' = '$ckUser',
     |  'password' = '$ckPassword',
     |  'table-name' = 'a',
     |  'scan.partition.column' = '$scanPartitionColumn',
     |  'scan.partition.num' = '$scanPartitionNum',
     |  'scan.partition.lower-bound' = '$scanPartitionLowerBound',
     |  'scan.partition.upper-bound' = '$scanPartitionUpperBound'
     |)
     |""".stripMargin
)

val resultTable: Table = tEnv.sqlQuery(
  s"""
     | select *,current_watermark(create_time) from a
     |""".stripMargin
)

tEnv.createTemporaryView("view_res", resultTable)
tEnv.toDataStream(resultTable).print()
itinycheng commented 1 year ago

@xyhc Source Connector 是通过clickhouse-jdbc发送查询语句到ClickHouse实现的(读取有限数据集),不支持类似Flink-CDC的功能;

xyhc commented 1 year ago

@xyhc Source Connector 是通过clickhouse-jdbc发送查询语句到ClickHouse实现的(读取有限数据集),不支持类似Flink-CDC的功能;

明白!谢谢!

xyhc commented 1 year ago

感谢回复!

如果我需要实现持续地读取clickhouse表,有什么方法或者建议吗?

 

------------------ 原始邮件 ------------------ 发件人: "itinycheng/flink-connector-clickhouse" @.>; 发送时间: 2022年11月30日(星期三) 中午11:21 @.>; @.**@.>; 主题: Re: [itinycheng/flink-connector-clickhouse] 读取clickhouse的时候,不会一直读取,加载完了当前时间的数据,就退出了 (Issue #62)

@xyhc Source Connector 是通过clickhouse-jdbc发送查询语句到ClickHouse实现的(读取有限数据集),不支持类似Flink-CDC的功能;

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

itinycheng commented 1 year ago

感谢回复! 如果我需要实现持续地读取clickhouse表,有什么方法或者建议吗?  

ClickHouse貌似没有类似Mysql binlog的功能,没办法做到实时监控数据变更的目的,除非二次开发; 另外可以前置一些代理服务(如nginx),监控实时数据的变化,这种得手动去解析日志,开发成本挺大的; 从业务出发,添加个列更新时间,然后通过定期的select * from t where update_at > $time,但Flink每次执行SQL都是一个新任务,也不能达到你的需求;

xyhc commented 1 year ago

OK, thanks a lot!

ltylty commented 1 year ago

感谢回复! 如果我需要实现持续地读取clickhouse表,有什么方法或者建议吗?   ------------------ 原始邮件 ------------------ 发件人: "itinycheng/flink-connector-clickhouse" @.>; 发送时间: 2022年11月30日(星期三) 中午11:21 @.>; @.**@.>; 主题: Re: [itinycheng/flink-connector-clickhouse] 读取clickhouse的时候,不会一直读取,加载完了当前时间的数据,就退出了 (Issue #62) @xyhc Source Connector 是通过clickhouse-jdbc发送查询语句到ClickHouse实现的(读取有限数据集),不支持类似Flink-CDC的功能; — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

如果有这样的需求应该用hudi