Closed fmiguelez closed 1 year ago
@fmiguelez The autoSkipNonRecoverableData option only takes effect when the ledger does not exist. It does not apply to other I/O exceptions; this restriction avoids losing data in unknown circumstances.
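To illustrate the distinction, here is a minimal sketch of the rule described above. It is not Pulsar's actual code; the names `ReadError` and `should_skip` are hypothetical and exist only for this example.

```python
from enum import Enum


class ReadError(Enum):
    """Hypothetical error kinds, for illustration only."""
    LEDGER_NOT_FOUND = "ledger_not_found"  # the ledger no longer exists
    IO_ERROR = "io_error"                  # e.g. a bookie is temporarily unreachable


def should_skip(error: ReadError, auto_skip_enabled: bool) -> bool:
    """Return True only when the data is known to be gone for good.

    A missing ledger can never be read again, so skipping it loses nothing
    that could still be recovered. An I/O error may be transient, so skipping
    on it could drop data that is still readable later.
    """
    return auto_skip_enabled and error is ReadError.LEDGER_NOT_FOUND
```

Under this model, an error such as "Bookie handle is not available" would fall into the I/O category and would not be skipped, even with the option enabled.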
The issue had no activity for 30 days, mark with Stale label.
Closed as answered.
Describe the bug
We have a cluster with 3 brokers and 3 bookies. We are facing issues with our sink, which writes to an external DB. The sink reads from 36 topics using 3 instances, and each topic's data goes to a DB table.
Every now and then our instances die with the error below (which does not reach our sink code). Eventually all three instances die.
To Reproduce
We are producing data to all these topics, with up to 4000 messages per topic every 5 minutes. After some hours (or even minutes) this issue arises.
We need to restart the sink. Sometimes the instances fail again immediately; other times they continue processing data.
We have the autoSkipNonRecoverableData option enabled on all broker instances.
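For reference, the relevant broker setting looks roughly like this. This is a sketch that assumes the standard broker.conf file; the rest of our configuration is omitted.

```
# broker.conf (fragment)
# Skip ledgers that can no longer be recovered instead of failing reads on them.
autoSkipNonRecoverableData=true
```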
Expected behavior
The autoSkipNonRecoverableData option should let the Pulsar client that feeds the sink continue processing messages, even if some of them fail.
Ideally these "Bookie handle is not available" errors should not happen.
In any case the sink instances should be able to restart automatically.