apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Feature][Connector-V2] Source is a file and sink is a DB or MQ; I want to run operations on the source file (backup/rename/delete) after the sink has completed #8037

Open xinshuaiyong opened 5 days ago

xinshuaiyong commented 5 days ago

Description

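We need to sync data from various file systems to various connectors. After the sync completes, we need to perform some operations on the source files, and these operations must run only after the data has been written to the target connectors. For this case, should we develop a new custom connector inside the sink block, or add another module after the sink block to handle post-processing? Is there a good way to handle this?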

Usage Scenario

No response

Related issues

No response

shashwatsai commented 1 day ago

We have a similar scenario: the source is an HDFS path (with an external Hive table on top of it) and the sink is another HDFS path (with another external Hive table on top of it). After the sink task completes, we need a way to add a partition to the Hive table through a DDL query (ALTER TABLE ... ADD PARTITION).

CC: @Hisoka-X, @arshadmohammad

Please suggest an existing approach we could use as a workaround.

Suggested event-based approach:

(image)

Hisoka-X commented 6 hours ago

cc @hailin0

Hisoka-X commented 6 hours ago

Centralized Notification Engine (or extending the existing notification system).

We already have a centralized notification engine, named EventListener.

On events, such as JobCompletion, JobError, JobFailure.

We should add new event types, such as JobFinishedEvent and JobFailedEvent, in https://github.com/apache/seatunnel/blob/3fb05da365649dddaeaf7cc21e167037b3bd40f6/seatunnel-api/src/main/java/org/apache/seatunnel/api/event/EventType.java#L20, and then trigger them in https://github.com/apache/seatunnel/blob/3fb05da365649dddaeaf7cc21e167037b3bd40f6/seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/master/JobMaster.java#L130
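
To make the idea concrete, here is a minimal, self-contained sketch of what "add job-level event types and trigger them when the job ends" could look like. It does not use the real SeaTunnel classes: `EventType`, `JobEvent`, `EventHandler`, and `JobRunner` below are hypothetical stand-ins written only for illustration, and the constant names `JOB_FINISHED` / `JOB_FAILED` are assumptions, not existing values in SeaTunnel's `EventType`.

```java
import java.util.ArrayList;
import java.util.List;

public class JobEventSketch {

    // Hypothetical stand-in for an event type enum; the two constants are the proposed additions.
    enum EventType { JOB_FINISHED, JOB_FAILED }

    // Hypothetical job-level event carrying the job id and its outcome.
    static final class JobEvent {
        final String jobId;
        final EventType type;
        final long createdTime;

        JobEvent(String jobId, EventType type) {
            this.jobId = jobId;
            this.type = type;
            this.createdTime = System.currentTimeMillis();
        }
    }

    // Hypothetical stand-in for a handler interface.
    interface EventHandler {
        void handle(JobEvent event);
    }

    // Hypothetical stand-in for the component that knows when a job has ended.
    static final class JobRunner {
        private final List<EventHandler> handlers = new ArrayList<>();

        void register(EventHandler handler) {
            handlers.add(handler);
        }

        void run(String jobId, Runnable job) {
            EventType outcome;
            try {
                job.run();
                outcome = EventType.JOB_FINISHED;
            } catch (RuntimeException e) {
                outcome = EventType.JOB_FAILED;
            }
            // The event is fired only after the job body (including the sink) has ended.
            JobEvent event = new JobEvent(jobId, outcome);
            for (EventHandler handler : handlers) {
                handler.handle(event);
            }
        }
    }

    public static void main(String[] args) {
        JobRunner runner = new JobRunner();
        runner.register(e -> System.out.println(e.jobId + " -> " + e.type));
        runner.run("job-1", () -> { });
        runner.run("job-2", () -> { throw new IllegalStateException("sink failed"); });
    }
}
```

The point of the sketch is the ordering: handlers are notified after the job body returns or throws, so anything they do happens after the sink.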

Notification listeners have connectors defined in their respective configurations.

Users can implement their own event handler (https://github.com/apache/seatunnel/blob/3fb05da365649dddaeaf7cc21e167037b3bd40f6/seatunnel-api/src/main/java/org/apache/seatunnel/api/event/EventHandler.java#L22) or reuse https://github.com/apache/seatunnel/blob/3fb05da365649dddaeaf7cc21e167037b3bd40f6/seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/event/JobEventHttpReportHandler.java#L49
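
For the original use case (backing up the source file once the sink is done), a custom handler might look roughly like the sketch below. It is an illustration under assumptions, not SeaTunnel code: `EventType`, `JobEvent`, `EventHandler`, and `BackupSourceFileHandler` are hypothetical names defined here so the example runs on its own, and it works on the local file system through `java.nio.file` rather than HDFS or another remote store.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PostSinkHandlerSketch {

    // Hypothetical event type and event, standing in for the proposed job-level events.
    enum EventType { JOB_FINISHED, JOB_FAILED }

    static final class JobEvent {
        final String jobId;
        final EventType type;

        JobEvent(String jobId, EventType type) {
            this.jobId = jobId;
            this.type = type;
        }
    }

    // Hypothetical handler interface.
    interface EventHandler {
        void handle(JobEvent event);
    }

    // Moves the source file into a backup directory, but only when the job finished successfully.
    static final class BackupSourceFileHandler implements EventHandler {
        private final Path sourceFile;
        private final Path backupDir;

        BackupSourceFileHandler(Path sourceFile, Path backupDir) {
            this.sourceFile = sourceFile;
            this.backupDir = backupDir;
        }

        @Override
        public void handle(JobEvent event) {
            if (event.type != EventType.JOB_FINISHED) {
                return; // leave the source file untouched so a failed job can be retried
            }
            try {
                Files.createDirectories(backupDir);
                Files.move(
                        sourceFile,
                        backupDir.resolve(sourceFile.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path work = Files.createTempDirectory("post-sink-sketch");
        Path source = Files.createFile(work.resolve("input.csv"));
        EventHandler handler = new BackupSourceFileHandler(source, work.resolve("backup"));

        handler.handle(new JobEvent("job-1", EventType.JOB_FAILED));   // file stays in place
        handler.handle(new JobEvent("job-1", EventType.JOB_FINISHED)); // file is moved to backup/
        System.out.println("backed up: " + Files.exists(work.resolve("backup").resolve("input.csv")));
    }
}
```

The same shape would apply to a rename, a delete, or the Hive ALTER TABLE ... ADD PARTITION case mentioned above: only the body of `handle` changes.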