getindata / flink-http-connector

HTTP connector for Apache Flink. Provides sources and sinks for the DataStream, Table, and SQL APIs.
Apache License 2.0

How should I restart a Flink job when the HTTP client response fails? #71

Closed PeatBoy closed 3 months ago

PeatBoy commented 7 months ago

Thank you for your open source project!

In my use case, I need the job to restart when I receive no response from the HTTP client or when the response indicates a failure. I have noticed that flink-http-connector currently does not implement a retry process. I tried to trigger a restart of the Flink job by throwing a RuntimeException from an HttpPostRequestCallback, but it did not take effect. Are there other feasible ways to report HTTP sink exceptions to Flink and trigger a job restart?

Looking forward to a reply. Thanks.

kristoffSC commented 5 months ago

Hi @PeatBoy, sorry for the long delay.

TBH, when we were designing the HTTP connector, the decision was that an HTTP endpoint failure should not trigger a job restart. It's common for HTTP endpoints to suffer occasional failures: timeouts, 500 errors (due to high load), etc.

I understand that what you need is to stop the job when the endpoint is not responsive, rather than restart it and try again, right? Restarting the job will have no effect on the availability of the endpoint. I guess stopping the job (using Flink's no-restart strategy) can have its benefits, but at the same time I think it could make the job unstable. Adding retries along with this feature would probably be good.
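To make the "retry, then give up" idea concrete, here is a minimal sketch in plain Java. It is not part of flink-http-connector: `RetrySketch`, `callWithRetry`, and `EndpointUnavailableException` are hypothetical names, and the request is modeled as a `Callable<Integer>` that returns an HTTP status code. The sketch retries with a doubling delay and throws only after the last attempt fails. Whether such an exception actually fails the Flink job depends on where it is thrown; as reported above, throwing from the callback did not take effect.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of flink-http-connector.
final class RetrySketch {

    /** Thrown once all attempts are used up (hypothetical name). */
    static class EndpointUnavailableException extends RuntimeException {
        EndpointUnavailableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Calls the request up to maxAttempts times, doubling the delay between
     * attempts. Returns the first 2xx status code; throws if none is seen.
     */
    static int callWithRetry(Callable<Integer> request, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        int lastStatus = -1;       // -1 means no status was received
        Exception lastError = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                lastStatus = request.call();
                lastError = null;
                if (lastStatus >= 200 && lastStatus < 300) {
                    return lastStatus; // success, stop retrying
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new EndpointUnavailableException("Interrupted during request", e);
            } catch (Exception e) {
                lastError = e; // e.g. timeout or connection failure
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new EndpointUnavailableException("Interrupted during backoff", e);
                }
                delayMs *= 2; // exponential backoff
            }
        }
        throw new EndpointUnavailableException(
                "Gave up after " + maxAttempts + " attempts, last status " + lastStatus, lastError);
    }

    public static void main(String[] args) {
        // Simulated endpoint: fails twice with 500, then answers 200.
        AtomicInteger calls = new AtomicInteger();
        int status = callWithRetry(() -> calls.incrementAndGet() < 3 ? 500 : 200, 5, 10L);
        System.out.println("status=" + status + " after " + calls.get() + " calls");
    }
}
```

With a helper like this, transient failures (timeouts, occasional 500s) are absorbed by the retries, and only a persistently failing endpoint surfaces as an exception that the job's restart strategy, or the absence of one, then has to handle.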

Or maybe you only need some metrics about failed requests?