DataLinkDC / dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
http://www.dinky.org.cn
Apache License 2.0
3.14k stars 1.15k forks

[Feature][core] Flink SQL task Support insert result preview #3893

Open MactavishCui opened 6 days ago

MactavishCui commented 6 days ago

Search before asking

Description

A short description of your feature

The insert results of a Flink SQL task can be previewed, and debug inserts do not affect production environment data.

Why this feature is useful to most users?

[Image: Situation] Currently, Dinky only supports previewing the result of a single SELECT statement, which is inconvenient when debugging a Flink SQL task. As shown above, every sink of a multi-sink Flink SQL task needs its own SELECT statement to check what would be inserted. Each SELECT statement must then be changed back to an INSERT before deployment to the production environment, and errors can be introduced while the statements are being changed. Frequent statement changes also waste users' time.

Additional information:

The real-time data warehouse platform of Meituan NAU has implemented a similar feature. The insert function is mocked during debugging, and debug results are written to and read back from S3. Statements do not need to be changed before deployment, result preview for multi-insert tasks is also supported, and production environment data is not affected. REF: https://zhuanlan.zhihu.com/p/532657279

Use case

A possible solution:

Dinky already supports settings such as job auto-cancel and maximum rows to catch, and data preview is implemented by SelectResult.class. To stay compatible with this existing logic and reuse as much of the current code as possible, I designed the following scheme: a customized connector saves the inserted data to accumulators. The results can then be retrieved through TableResult.class and handled in much the same way as in SelectResult.class, so more code can be reused. If the task is set to be mocked, SqlExplainer changes the connector options of the sink tables to those of the customized mock connector. [Images: Solution, MockFunction]
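
To make the accumulator idea concrete, here is a minimal sketch of how mocked insert rows could be collected per sink table with a row limit. It is only an illustration under assumptions: the class `MockSinkCollector` and its methods are hypothetical, rows are represented as plain strings, and an in-memory map stands in for Flink accumulators. It is not the code from the draft PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: collects rows "inserted" into mocked sinks, keyed by
 * sink table, up to a maximum number of rows per sink. In a real Flink job
 * accumulators would play this role; a plain map stands in for them here.
 */
final class MockSinkCollector {
    private final int maxRows;
    private final Map<String, List<String>> rowsBySink = new LinkedHashMap<>();

    MockSinkCollector(int maxRows) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must be >= 0");
        }
        this.maxRows = maxRows;
    }

    /** Records one row for the given sink; returns false once the limit is reached. */
    boolean collect(String sinkTable, String row) {
        List<String> rows = rowsBySink.computeIfAbsent(sinkTable, k -> new ArrayList<>());
        if (rows.size() >= maxRows) {
            return false; // limit reached: the row is dropped from the preview
        }
        rows.add(row);
        return true;
    }

    /** Returns a read-only preview of the rows collected for one sink. */
    List<String> preview(String sinkTable) {
        return Collections.unmodifiableList(
                rowsBySink.getOrDefault(sinkTable, Collections.emptyList()));
    }
}
```

Keeping one list per sink table is what would allow a multi-insert task to show a separate preview for each sink, while the row limit mirrors the existing "maximum rows" setting.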

My attempt

Based on the scheme described above, I have implemented this feature in my local repository. Here are the results: [image] If a PR is welcome for this issue, I will submit it in the following steps:

  1. Customized connector.
  2. MockExplainer.class: rewrites INSERT statements into mock statements based on a template.
  3. Code changes in the core and admin modules: the backend implementation.
  4. The front-end implementation.
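
As a rough illustration of step 2, the sketch below renders a mock sink table definition from a template, so that the task's INSERT statements can stay unchanged while only the sink's connector options are swapped. Everything named here is an assumption made for the example: the class `MockDdlTemplate`, the connector identifier `mock-preview`, and the `table-identifier` option are hypothetical and are not taken from Dinky or from the draft PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of template-based rewriting of a sink table definition. */
final class MockDdlTemplate {
    // Assumed connector identifier; the real mock connector may use another name.
    static final String MOCK_CONNECTOR = "mock-preview";

    private static final String TEMPLATE = "CREATE TABLE `%s` (%s) WITH (%s)";

    /** Options that replace the original connector options of a sink table. */
    static Map<String, String> mockOptions(String tableName) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", MOCK_CONNECTOR);
        options.put("table-identifier", tableName); // hypothetical option
        return options;
    }

    /** Renders options as 'key' = 'value' pairs, doubling single quotes. */
    static String renderOptions(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> "'" + escape(e.getKey()) + "' = '" + escape(e.getValue()) + "'")
                .collect(Collectors.joining(", "));
    }

    private static String escape(String s) {
        return s.replace("'", "''");
    }

    /** Builds the mock CREATE TABLE statement; the column list is kept as-is. */
    static String mockCreateTable(String tableName, String columns) {
        return String.format(
                TEMPLATE,
                tableName.replace("`", "``"),
                columns,
                renderOptions(mockOptions(tableName)));
    }
}
```

For example, `MockDdlTemplate.mockCreateTable("orders_sink", "id BIGINT, name STRING")` yields a definition for `orders_sink` with the same columns but only the mock connector options.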

Looking forward to your reply and discussion.

Related issues

No.

Are you willing to submit a PR?

Code of Conduct

aiwenmo commented 5 days ago

Your idea is very good. Our IDE is currently being refactored. Please implement it based on the latest IDE. We are looking forward to your code.

MactavishCui commented 4 days ago

@aiwenmo I have submitted a draft PR: https://github.com/DataLinkDC/dinky/pull/3897. I see that both https://github.com/DataLinkDC/dinky/pull/3889 and https://github.com/DataLinkDC/dinky/pull/3854 are refactoring the data studio or SQL execution, and both carry the '1.2.0' milestone tag. Will all of these refactors be included in version 1.2.0? Can I submit my PR against the refactored code after Dinky 1.2.0 is published? I will rework my code for this feature and switch the PR status to 'ready for review' once all the IDE refactors you mentioned are ready. Looking forward to your reply.

Zzm0809 commented 3 days ago

@MactavishCui Thank you for your contribution. As you can see, the task submission process and the overall DataStudio module on the front end are being refactored. This refactoring will be released in 1.2.0, so your PR needs to wait for the two PRs above to be merged and then be updated accordingly. I suggest you keep an eye on the progress of those two PRs. Of course, you will also be notified when they are merged.

MactavishCui commented 2 days ago

@Zzm0809 Thanks for your reply! After the related PRs are merged, I will read the refactored code, redesign my implementation, make the corresponding changes, and update the draft PR linked to this issue. In the meantime I will keep working on other issues, and I hope to make more contributions to Dinky!

Zzm0809 commented 2 days ago

ok, thanks!!

Pirate5946 commented 1 day ago

Borrowing this thread to ask a question: does Dinky v1.1.0 support Flink SQL 1.19 with multiple INSERT INTO statements in one task, each writing to a different sink? @Zzm0809 @aiwenmo

image
Zzm0809 commented 1 day ago

Yes, it is supported. Just write the multiple SQL statements directly; statement sets are enabled by default.