getindata / flink-http-connector

Http Connector for Apache Flink. Provides sources and sinks for the DataStream, Table and SQL APIs.
Apache License 2.0

Sink Writer to write in batches #42

Closed talsheldon closed 1 year ago

talsheldon commented 1 year ago

My understanding is that for every single element (from upstream) there will be an HTTP request for it.

HttpSink.builder[..]
  .setElementConverter(
    (s, _context) => new HttpSinkRequestEntry(
      "POST",
      new Gson().toJson(s).getBytes(StandardCharsets.UTF_8)))

So basically, every value from upstream becomes one HttpSinkRequestEntry, which later translates into one HTTP request when submitRequestEntries finally executes. I'd like to have fewer HTTP requests, and batch these requests into a single HTTP request whose body is a list of the elements, e.g. a single HTTP request for a batch of size maxBatchSize.

How could I achieve that?

kristoffSC commented 1 year ago

Hi @talsheldon, thanks for the question.

I will reply to you after Dec 20th, when I am back from vacation.

kristoffSC commented 1 year ago

Hi @talsheldon

My understanding is that for every single element (from upstream) there will be an HTTP request for it.

Yes, this is correct. Currently, the HTTP Sink splits the collection of requests passed from Flink via AsyncSink/AsyncSinkWriter into individual requests.

The reason is that, at the time, this was what we needed in our project. The web service we were supposed to send requests to was unable to handle "batch" requests. It simply did not understand a REST body containing an array of individual requests.

I'd like to have fewer HTTP requests, and batch these requests into a single HTTP request

Yes, this would be a good enhancement for the connector. I totally agree.

At first glance, the change we have to implement is twofold. We would have to refactor the JavaNetSinkHttpClient::submitRequests method, which currently splits the collection of elements into individual HTTP requests, and also think a little about setElementConverter.

If you would be interested in contributing, let me know.
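
To make the batching idea more concrete, here is a minimal sketch in plain Java. It is not the connector's actual implementation: BatchBodySketch and toBatchBodies are hypothetical names that do not exist in flink-http-connector. The sketch assumes each element is already serialized as a UTF-8 JSON value and that the target service accepts a JSON array as the request body.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of flink-http-connector. */
final class BatchBodySketch {

    private BatchBodySketch() {}

    /**
     * Groups already-serialized JSON elements into chunks of at most
     * maxBatchSize and renders each chunk as one JSON array body.
     * Each returned byte[] would be the body of a single HTTP request.
     */
    static List<byte[]> toBatchBodies(List<byte[]> jsonElements, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<byte[]> bodies = new ArrayList<>();
        for (int start = 0; start < jsonElements.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, jsonElements.size());
            StringBuilder body = new StringBuilder("[");
            for (int i = start; i < end; i++) {
                if (i > start) {
                    body.append(',');
                }
                body.append(new String(jsonElements.get(i), StandardCharsets.UTF_8));
            }
            body.append(']');
            bodies.add(body.toString().getBytes(StandardCharsets.UTF_8));
        }
        return bodies;
    }

    public static void main(String[] args) {
        List<byte[]> elements = List.of(
            "{\"id\":1}".getBytes(StandardCharsets.UTF_8),
            "{\"id\":2}".getBytes(StandardCharsets.UTF_8),
            "{\"id\":3}".getBytes(StandardCharsets.UTF_8));
        // Prints one line per request body that would be sent.
        for (byte[] body : toBatchBodies(elements, 2)) {
            System.out.println(new String(body, StandardCharsets.UTF_8));
        }
    }
}
```

With three elements and a maxBatchSize of 2, this produces two request bodies, [{"id":1},{"id":2}] and [{"id":3}], instead of three individual requests.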

shmilygkd commented 1 year ago

Does the current version support this feature?

kristoffSC commented 1 year ago

Hi @shmilygkd, unfortunately this feature is still not supported.

Maybe I will be able to find some time to work on it in the near future, or maybe you would be interested in contributing?

At first glance, the change we have to implement is twofold. We would have to refactor the JavaNetSinkHttpClient::submitRequests method, which currently splits the collection of elements into individual HTTP requests, and also think a little about setElementConverter.

If you would be interested in contributing, let me know.

shmilygkd commented 1 year ago

OK, let's sync up when there is progress.

kristoffSC commented 1 year ago

@shmilygkd actually, I am working on this as we speak :)

shmilygkd commented 1 year ago

Looking forward to it!

kristoffSC commented 1 year ago

It is expected to be ready sometime next week.

kristoffSC commented 1 year ago

This will be released in 0.10.0. It is currently available in 0.10.0-snapshot.

kristoffSC commented 1 year ago

@talsheldon @shmilygkd I've released version 0.10.0 that contains this feature. Feel free to try it :)

shmilygkd commented 1 year ago

@kristoffSC good job~