line / decaton

High throughput asynchronous task processing on Apache Kafka
Apache License 2.0

Enqueue different taskData on retry (partial retry support) #199

Open bitterfox opened 1 year ago

bitterfox commented 1 year ago

We often use Decaton to process a batch of targets as a single task, for better batching toward middleware or downstream systems. For example, updating 100 users' data, or deleting 100 linked records on unregistration.

Such tasks can hit a partial failure due to downstream issues such as a DB failure: some targets succeed while others fail. In that case, we'd like to retry only the failed part, to avoid duplicating operations on the already-successful targets and to avoid putting unnecessary load on the downstream system.
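For illustration, the batch step in such a processor might look like the following sketch. The names (`doMyProcess`, the integer ids, the simulated downstream call) are placeholders for this issue, not Decaton API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: process a batch of target ids against a downstream
// system and return only the ids that failed, so a later retry can be partial.
class BatchProcessor {
    // Simulated downstream call; here even ids "fail" to mimic a partial outage.
    static boolean callDownstream(int id) {
        return id % 2 != 0;
    }

    static List<Integer> doMyProcess(List<Integer> targets) {
        List<Integer> failed = new ArrayList<>();
        for (int id : targets) {
            if (!callDownstream(id)) {
                // Only failed targets are carried over into the retry task,
                // so successful targets are never re-processed.
                failed.add(id);
            }
        }
        return failed;
    }
}
```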

However, current Decaton doesn't support this scenario natively, so we either retry the whole task or implement the logic ourselves by touching TaskMetadata and DecatonTaskRetryQueueingProcessor and handling defer completion.

Retrying the whole task is not preferred. Beyond the reason above, targets that succeeded in a previous attempt might fail in a later retry attempt, so the task might never be considered successful even though every individual target succeeded at some point.

Can decaton support such retry in ProcessingContext?

I'd like to do the following:

void process(ProcessingContext<MyTask> context, MyTask task) {
  List<Id> targets = task.getTargetIds();
  List<Id> failedTargets = doMyProcess(targets);

  // Re-enqueue a task carrying only the failed targets, up to the retry limit
  if (!failedTargets.isEmpty() && maxRetryCount > context.metadata().retryCount()) {
    context.retry(new MyTask(failedTargets));
  }
}

Currently, ProcessingContextImpl doesn't know how to serialize a task to byte[], so to implement this we would need to pass a serializer to ProcessingContext somehow, or make the signature retry(byte[]), which doesn't seem like a good signature.
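One possible shape of that plumbing, sketched with generic types rather than Decaton's actual interfaces (none of `TaskSerializer` or `RetryingContext` below is Decaton API), would be to hand the context a serializer so that `retry(task)` can turn the reduced task back into byte[] before re-enqueueing it:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch only: illustrates a context holding a serializer so
// retry(task) can produce byte[] for the retry topic. Not Decaton API.
interface TaskSerializer<T> {
    byte[] serialize(T task);
}

class RetryingContext<T> {
    private final TaskSerializer<T> serializer;
    byte[] lastRetryPayload; // stands in for the bytes enqueued to the retry topic

    RetryingContext(TaskSerializer<T> serializer) {
        this.serializer = serializer;
    }

    // retry(task) serializes the (possibly reduced) task for re-enqueueing,
    // so callers can pass a task containing only the failed targets.
    void retry(T task) {
        lastRetryPayload = serializer.serialize(task);
    }
}
```

With this shape, the processor keeps a typed `retry(T)` signature and the serialization detail stays inside the context, avoiding the awkward `retry(byte[])` alternative.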