Describe the bug
In our staging cluster, some bookie instances keep logging "No ledger found while reading entry: xx from ledger: xxxx". I investigated and found that the internal metadata of the LedgerHandle (org.apache.bookkeeper.client.LedgerHandle#versionedMetadata) for specific ledgers is inconsistent with the latest ledger metadata in ZooKeeper.
Screenshots
Bookie logs:
The log on that bookie shows that the readEntry request comes from 172.30.10.3:48336, which is a broker instance.
So the broker keeps sending readEntry requests to the wrong bookies, which are not in the ledger's ensemble list.
Desktop (please complete the following information):
OS:
# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
Additional context
Bookie auto-recovery is enabled.
Machine 172.30.10.2 crashed once at 2022-01-03 17:40:08 +08:00, and the bookie instance started at 2022-01-03 19:35:08 +08:00. The "No ledger found" logs started to appear after 172.30.10.2 crashed.
Original Issue: apache/pulsar#13693
Describe the bug
In our staging cluster, some bookie instances keep logging "No ledger found while reading entry: xx from ledger: xxxx". I investigated and found that the internal metadata of the LedgerHandle (org.apache.bookkeeper.client.LedgerHandle#versionedMetadata) for specific ledgers is inconsistent with the latest ledger metadata in ZooKeeper.
Screenshots
Bookie logs:
The log on that bookie shows that the readEntry request comes from 172.30.10.3:48336, which is a broker instance.
On broker instance 172.30.10.3:8080, the ensembles are reported as 172.30.10.5:3181, 172.30.10.2:3181.
But the latest ledger metadata shows that the ensembles are 172.30.10.4:3181, 172.30.10.3:3181.
So the broker keeps sending readEntry requests to the wrong bookies, which are not in the ledger's ensemble list.
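The mismatch reported here can be illustrated with a small standalone sketch. It does not use the BookKeeper client API: the function name stale_read_targets and the representation of an ensemble as a plain list of address strings are assumptions made only for illustration. The addresses are the ones reported in this issue.

```python
def stale_read_targets(cached_ensemble, latest_ensemble):
    """Return bookies from the cached ensemble that are absent from the latest one.

    Hypothetical helper: both arguments are plain lists of "host:port" strings,
    not BookKeeper metadata objects.
    """
    latest = set(latest_ensemble)
    return [bookie for bookie in cached_ensemble if bookie not in latest]


# Ensemble as seen by the broker's LedgerHandle (from this report).
cached = ["172.30.10.5:3181", "172.30.10.2:3181"]
# Ensemble in the latest ledger metadata (from this report).
latest = ["172.30.10.4:3181", "172.30.10.3:3181"]

# Every bookie listed here would receive readEntry requests for a ledger
# it is not expected to hold according to the latest metadata.
stale = stale_read_targets(cached, latest)
print(stale)
```

With the values from this report, both cached bookies are stale, which matches the observation that the reads go to bookies outside the current ensemble.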
Desktop (please complete the following information):
Additional context
Bookie auto-recovery is enabled.
Machine 172.30.10.2 crashed once at 2022-01-03 17:40:08 +08:00, and the bookie instance started at 2022-01-03 19:35:08 +08:00. The "No ledger found" logs started to appear after 172.30.10.2 crashed.
Possibly related: https://github.com/apache/pulsar/issues/7214