confluentinc / kafka-connect-hdfs

Kafka Connect HDFS connector

Allow to limit retry write errors by timeout #663

Open JozoVilcek opened 1 year ago

JozoVilcek commented 1 year ago

When a write failure occurs, the Connect task handles it by sleeping for a backoff interval and retrying until writes recover. In my case, when there are infrastructure problems while writing to HDFS, the HDFS write pipeline stays pinned to the same nodes and keeps retrying through them. When the infrastructure takes a long time to recover, the Connect task accumulates delay.

I would like to be able to set an upper bound on retries, after which the operation is reset and the temp file is recreated. This would initiate a new write pipeline and give the write a chance to complete via different HDFS nodes.
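
To make the request concrete, here is a minimal sketch of the kind of time-bounded retry decision I have in mind. It is not the connector's actual code: `RetryBudget`, `Action`, and `maxRetryMs` are hypothetical names used only for illustration, and the clock is passed in so the logic can be exercised without sleeping.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper, NOT part of kafka-connect-hdfs.
 * Tracks how long a streak of write failures has lasted and decides
 * whether the task should keep retrying or reset the write.
 */
final class RetryBudget {
    enum Action { RETRY, RESET }

    private final long maxRetryMs;     // hypothetical upper bound on a failure streak
    private final LongSupplier clock;  // wall-clock source in milliseconds
    private long firstFailureMs = -1;  // -1 means no failure streak is in progress

    RetryBudget(long maxRetryMs, LongSupplier clock) {
        this.maxRetryMs = maxRetryMs;
        this.clock = clock;
    }

    /** Call on every write failure; returns what the caller should do next. */
    Action onFailure() {
        long now = clock.getAsLong();
        if (firstFailureMs < 0) {
            firstFailureMs = now;      // first failure of a new streak
        }
        if (now - firstFailureMs >= maxRetryMs) {
            firstFailureMs = -1;       // start a fresh streak after the reset
            return Action.RESET;       // caller would drop the temp file and reopen
        }
        return Action.RETRY;           // caller would back off and retry as today
    }

    /** Call on every successful write to end the current failure streak. */
    void onSuccess() {
        firstFailureMs = -1;
    }
}
```

A task would construct this with something like `new RetryBudget(timeoutMs, System::currentTimeMillis)`, call `onFailure()` in its existing error path, and on `RESET` recreate the temp file instead of sleeping again. Measuring against wall-clock time rather than record time is deliberate, so the bound still applies when no new records are being processed.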

Currently, this scenario works effectively when using e.g. a wall-clock time based partitioner, where the scheduled rotation will very likely trigger an error while attempting to close the open file, and that error initiates a reset. With a record-based time partitioner this does not work, because "time does not move".