vesoft-inc / nebula-flink-connector

Flink Connector for Nebula Graph

Do we need to support FLIP-143: Unified Sink API? #64

Open spike-liu opened 2 years ago

spike-liu commented 2 years ago

Split out of the discussion in https://github.com/vesoft-inc/nebula-flink-connector/issues/38

spike-liu commented 2 years ago

@Nicole00 I have just finished researching FLIP-143: Unified Sink API, and here are my thoughts:

First of all, there is no conflict between the existing implementation via RichSourceFunction and the new API mentioned above. Rather, the new API gives developers an improved mechanism for ensuring exactly-once semantics.

image
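
To make the mechanism concrete, here is a rough plain-Java model of the writer/committer split that FLIP-143 describes. It is only a sketch: `UnifiedSinkSketch`, `FakeStore`, `Writer` and `Committer` are hypothetical names invented for illustration, and none of them are the actual Flink interfaces or anything in this connector.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the writer/committer split; NOT the Flink interfaces. */
class UnifiedSinkSketch {

    /** Hypothetical stand-in for the external store, e.g. a graph database. */
    static class FakeStore {
        final List<String> visible = new ArrayList<>();
    }

    /** Buffers records and hands them over as one committable per checkpoint. */
    static class Writer {
        private List<String> buffer = new ArrayList<>();

        void write(String record) {
            buffer.add(record);
        }

        /** At a checkpoint: seal the current buffer and start a fresh one. */
        List<String> prepareCommit() {
            List<String> committable = buffer;
            buffer = new ArrayList<>();
            return committable;
        }
    }

    /** Makes a sealed committable visible once its checkpoint has completed. */
    static class Committer {
        private final FakeStore store;

        Committer(FakeStore store) {
            this.store = store;
        }

        void commit(List<String> committable) {
            store.visible.addAll(committable);
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        Writer writer = new Writer();
        Committer committer = new Committer(store);

        writer.write("v1");
        writer.write("v2");
        System.out.println("visible before commit: " + store.visible);

        List<String> committable = writer.prepareCommit();
        writer.write("v3"); // belongs to the next checkpoint, not committed yet
        committer.commit(committable);
        System.out.println("visible after commit: " + store.visible);
    }
}
```

In this model the sink only decides what a committable contains and how it is committed; when `prepareCommit` and `commit` are called is left to the caller, which is the role the framework would play.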

Secondly, I checked the master branch of Flink: currently only the Kafka, Kinesis, Elasticsearch and Pulsar connectors implement the new API.

image

Thirdly, the JDBC connector also supports exactly-once semantics via the old CheckpointedFunction/CheckpointListener API.

image

However, the JDBC SQL connector does not support exactly-once semantics:

image

But JDBC does provide exactly-once semantics via the Java API:

image
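
For reference, the checkpoint-hook pattern behind the old CheckpointedFunction/CheckpointListener approach can be modelled roughly as below. This is a simplified plain-Java sketch, not the JDBC connector's code: the class `CheckpointCommitSketch` and its `failAndRestore` method are hypothetical, and the other method names only mirror the roles of the Flink hooks without implementing the Flink interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Plain-Java model of checkpoint-driven commits; NOT the Flink or JDBC classes. */
class CheckpointCommitSketch {
    // Records written since the last checkpoint, not yet sealed.
    private List<String> open = new ArrayList<>();
    // Sealed batches waiting for their checkpoint to complete, keyed by checkpoint id.
    private final TreeMap<Long, List<String>> prepared = new TreeMap<>();
    // What the external database would show to readers.
    final List<String> committed = new ArrayList<>();

    void invoke(String record) {
        open.add(record);
    }

    /** Role of snapshotState: seal the open batch under this checkpoint id. */
    void snapshotState(long checkpointId) {
        prepared.put(checkpointId, open);
        open = new ArrayList<>();
    }

    /** Role of notifyCheckpointComplete: commit every batch up to this id. */
    void notifyCheckpointComplete(long checkpointId) {
        NavigableMap<Long, List<String>> done = prepared.headMap(checkpointId, true);
        for (List<String> batch : done.values()) {
            committed.addAll(batch);
        }
        done.clear(); // the view is backed by 'prepared', so this removes the entries
    }

    /** Hypothetical failure hook: records after the last checkpoint are rolled back. */
    void failAndRestore() {
        open = new ArrayList<>();
    }

    public static void main(String[] args) {
        CheckpointCommitSketch sink = new CheckpointCommitSketch();
        sink.invoke("e1");
        sink.snapshotState(1);
        sink.notifyCheckpointComplete(1);

        sink.invoke("e2");
        sink.failAndRestore();  // e2 was never sealed, so it is dropped
        sink.invoke("e2");      // the source replays e2 after the restart
        sink.snapshotState(2);
        sink.notifyCheckpointComplete(2);

        System.out.println(sink.committed); // e2 appears once, not twice
    }
}
```

The point of the sketch is that nothing becomes visible until the checkpoint that sealed it has completed, so a replay after a failure does not produce duplicates.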

Hence it would be good for us to give users an option to enable exactly-once semantics, just like Kafka's implementation below:

image
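
As a rough idea of what such an option could look like on our side, here is a minimal plain-Java sketch. Everything in it is an assumption for discussion: `DeliveryOptionSketch`, `Guarantee`, `SinkOptions`, `Builder`, `setDeliveryGuarantee`, `setBatchSize` and the default values are all hypothetical and do not exist in the connector today.

```java
/** Hypothetical option model for a Nebula sink; none of these names exist in the connector. */
class DeliveryOptionSketch {

    /** The guarantee the user asks the sink to provide. */
    enum Guarantee { NONE, AT_LEAST_ONCE, EXACTLY_ONCE }

    /** Immutable options object produced by the builder. */
    static final class SinkOptions {
        final Guarantee guarantee;
        final int batchSize;

        private SinkOptions(Guarantee guarantee, int batchSize) {
            this.guarantee = guarantee;
            this.batchSize = batchSize;
        }
    }

    static final class Builder {
        private Guarantee guarantee = Guarantee.AT_LEAST_ONCE; // assumed default
        private int batchSize = 100;                           // assumed default

        Builder setDeliveryGuarantee(Guarantee guarantee) {
            if (guarantee == null) {
                throw new IllegalArgumentException("guarantee must not be null");
            }
            this.guarantee = guarantee;
            return this;
        }

        Builder setBatchSize(int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.batchSize = batchSize;
            return this;
        }

        SinkOptions build() {
            return new SinkOptions(guarantee, batchSize);
        }
    }

    /** Only the exactly-once mode has to hold writes back until a checkpoint completes. */
    static boolean commitsOnCheckpoint(SinkOptions options) {
        return options.guarantee == Guarantee.EXACTLY_ONCE;
    }

    public static void main(String[] args) {
        SinkOptions options = new Builder()
                .setDeliveryGuarantee(Guarantee.EXACTLY_ONCE)
                .setBatchSize(500)
                .build();
        System.out.println(options.guarantee + ", commitsOnCheckpoint=" + commitsOnCheckpoint(options));
    }
}
```

Keeping the guarantee as an explicit option means existing users keep the current behaviour by default and only opt in to the checkpoint-bound commit path when they need it.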

What do you think, @Nicole00?

liuxiaocs7 commented 2 years ago

same title as #65 ?

spike-liu commented 2 years ago

> same title as #65 ?

My mistake, I have updated the title of #65.

Nicole00 commented 2 years ago

Thanks for your research, @spike-liu. I believe it would be more flexible to give the user an option to have exactly-once semantics.

  1. FLIP-143 unifies the Data Sink API, so a sink implementation only has to define WHAT to commit and HOW to commit it. I also checked the code on the master branch: the old sink package has been deprecated.
  2. The current streaming-style sink implementation is not suitable for the bounded scenario, and Nebula is in fact a bounded data source scenario.

So I would prefer to refactor the sink API. I will mark this issue as a feature request.

This could be an issue that needs refactoring.