apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

function EFFECTIVELY_ONCE cannot support input topic batching #17061

Open yapxue opened 2 years ago

yapxue commented 2 years ago

Search before asking

Motivation

With the EFFECTIVELY_ONCE processing guarantee, a Pulsar function's producer attaches a sequenceId to every message it publishes to the output topic, so that the broker can deduplicate them. The sequenceId is built from the input message's ledgerId and entryId. Messages in the same batch share one entryId, so they all end up with the same sequenceId: the first one is accepted, and the others are treated as duplicates and dropped. This may cause data loss.
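
A minimal sketch of the collision, written without any Pulsar dependency. The class name, the method name, and the 28-bit shift are assumptions for illustration only. The point is that any sequenceId derived solely from ledgerId and entryId cannot tell apart two messages that came from the same batch.

```java
// Hypothetical helper, not Pulsar's actual code: it derives a sequenceId
// from ledgerId and entryId only, as described in this issue.
final class SequenceIdCollision {
    private SequenceIdCollision() {}

    // Assumed layout: ledgerId in the high bits, entryId in the low 28 bits.
    static long sequenceId(long ledgerId, long entryId) {
        return (ledgerId << 28) | entryId;
    }

    public static void main(String[] args) {
        // Two messages from the same batch: same ledgerId and entryId,
        // different batchIndex (0 and 1). The batchIndex is never used.
        long first = sequenceId(10L, 3L);   // batchIndex 0
        long second = sequenceId(10L, 3L);  // batchIndex 1

        // Deduplication only accepts a sequenceId greater than the last
        // one it saw, so the second message would be dropped.
        System.out.println("collision: " + (first == second));
    }
}
```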

Solution

Maybe include the batchIndex in the sequenceId, so that each message in a batch gets a distinct value.
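
One possible shape for this, as a rough sketch rather than a proposed patch. The class name, the field widths (28 bits for entryId, 12 bits for batchIndex), and the handling of a negative batchIndex for non-batched messages are all assumptions. Putting batchIndex in the lowest bits keeps the sequenceId increasing within an entry and across entries, which deduplication relies on.

```java
// Hypothetical helper: packs ledgerId, entryId and batchIndex into one long.
final class BatchAwareSequenceId {
    private static final int ENTRY_BITS = 28; // assumed width for entryId
    private static final int BATCH_BITS = 12; // assumed width for batchIndex
    private static final int LEDGER_BITS = 63 - ENTRY_BITS - BATCH_BITS;

    private BatchAwareSequenceId() {}

    static long sequenceId(long ledgerId, long entryId, int batchIndex) {
        // Assumption: a negative batchIndex marks a non-batched message,
        // which is treated here as index 0.
        int index = Math.max(batchIndex, 0);

        // Reject values that do not fit, instead of silently overlapping bits.
        if (ledgerId < 0 || ledgerId >= (1L << LEDGER_BITS)) {
            throw new IllegalArgumentException("ledgerId out of range: " + ledgerId);
        }
        if (entryId < 0 || entryId >= (1L << ENTRY_BITS)) {
            throw new IllegalArgumentException("entryId out of range: " + entryId);
        }
        if (index >= (1 << BATCH_BITS)) {
            throw new IllegalArgumentException("batchIndex out of range: " + batchIndex);
        }

        return (ledgerId << (ENTRY_BITS + BATCH_BITS))
                | (entryId << BATCH_BITS)
                | index;
    }
}
```

The trade-off is that bits given to batchIndex are taken away from ledgerId, so the widths would need to be chosen against real ledger and batch size limits.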

Alternatives

No response

Anything else?

No response

Are you willing to submit a PR?

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.