smallrye / smallrye-reactive-messaging

SmallRye Reactive Messaging
http://www.smallrye.io/smallrye-reactive-messaging/
Apache License 2.0
235 stars 176 forks source link

Kafka batch processing always duplicates multiples of max.poll.records #2549

Open itatdcer opened 6 months ago

itatdcer commented 6 months ago

Using the option mp.messaging.incoming.$channel.batch=true, always result in duplicate records, multiple of max.poll.records configuration.

In a processing "pipeline" like

`@Incoming("inbound-channel") @Outgoing("outbound-channel") public Multi<Message<byte[]>> process(Message<List<byte[]>> msg) {

    return Multi.createFrom()
            .items(() -> {

                return service.apply(msg).stream()
                        .map(result -> {

                            msg.ack().toCompletableFuture().complete(null);
                            if (result != null) {
                                return result;
                            }

                            return null;

                        });

            })
            .runSubscriptionOn(Infrastructure.getDefaultWorkerPool())
            .onCompletion().ifEmpty().continueWith(() -> {
                msg.ack().toCompletableFuture().complete(null);
                return List.of();
            });
}`

The output topic is always filled with the number of input messages plus n x "max.poll.records`".

SivaM07 commented 4 months ago

@itatdcer, I recently debugged duplicates on the non batch message processing, let me know if you are still trying to troubleshoot, I can help you with that

cescoffier commented 4 months ago

Do you have a reproducer?