Open kwenzh opened 2 years ago
I had a similar problem. Broker log:
[BookKeeperClientWorker-OrderedExecutor-1-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L173395 E0-E0, Sent to [pulsar-iot-bookie-1.pulsar-iot-bookie.pulsar-prod.svc.cluster.local:3181, pulsar-iot-bookie-0.pulsar-iot-bookie.pulsar-prod.svc.cluster.local:3181], Heard from [] : bitset = {}, Error = 'No such ledger exists on Bookies'. First unread entry is (-1, rc = null)
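For triage it can help to pull the ledger ID, entry range, and bookie list out of lines like the one above. The following is only a minimal sketch: the regular expression is fitted to the log line quoted in this thread, not to any documented BookKeeper log format, and the function name `parse_read_failure` is made up for illustration.

```python
import re

# Pattern fitted to the PendingReadOp failure line quoted in this thread.
# Assumption: other BookKeeper versions may format this message differently.
READ_FAIL = re.compile(
    r"Read of ledger entry failed: L(?P<ledger>\d+) E(?P<first>\d+)-E(?P<last>\d+), "
    r"Sent to \[(?P<sent>[^\]]*)\], Heard from \[(?P<heard>[^\]]*)\]"
    r".*?Error = '(?P<error>[^']*)'"
)


def _split_bookies(raw):
    """Split a comma-separated bookie list, dropping empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def parse_read_failure(line):
    """Return the fields of a read-failure log line, or None if it does not match."""
    match = READ_FAIL.search(line)
    if match is None:
        return None
    return {
        "ledger_id": int(match["ledger"]),
        "first_entry": int(match["first"]),
        "last_entry": int(match["last"]),
        "sent_to": _split_bookies(match["sent"]),
        "heard_from": _split_bookies(match["heard"]),
        "error": match["error"],
    }
```

An empty `heard_from` list together with a non-empty `sent_to` list, as in the line above, means none of the bookies the read was sent to answered it successfully.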
Description: this is a Pulsar cluster running in k8s, Pulsar version 2.8.2. The goal was to replace a bookie's disk with a new one.
Operating steps:
1. Run bin/bookkeeper shell bookieformat -deleteCookie -f
2. Restart the pod
3. Wait for the bookie to automatically sync its data to the new disk
The business service kept running normally, but when the business service was restarted it threw the following exception:
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [so.dian.clover.pulsar.PulsarProducer]: Factory method 'createPulsarProducer' threw exception; nested exception is org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException: org.apache.pulsar.broker.service.schema.exceptions.SchemaException: No such ledger exists on Bookies - ledger=173395 - operation=Failed to read entry - entry=0 caused by org.apache.pulsar.broker.service.schema.exceptions.SchemaException: No such ledger exists on Bookies - ledger=173395 - operation=Failed to read entry - entry=0
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185)
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:651)
... 41 common frames omitted
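To confirm that the client-side exception points at the same ledger as the broker log, the `ledger=… - operation=… - entry=…` part of the SchemaException message can be extracted. This is a rough sketch that assumes the message layout is exactly as shown in the stack trace above; `parse_schema_error` is a hypothetical helper name.

```python
import re

# Assumption: the message carries "ledger=<id> - operation=<text> - entry=<id>"
# exactly as in the exception quoted in this thread.
SCHEMA_ERR = re.compile(
    r"ledger=(?P<ledger>\d+) - operation=(?P<operation>.+?) - entry=(?P<entry>\d+)"
)


def parse_schema_error(message):
    """Return ledger ID, operation, and entry ID from the first match, or None."""
    match = SCHEMA_ERR.search(message)
    if match is None:
        return None
    return {
        "ledger_id": int(match["ledger"]),
        "operation": match["operation"],
        "entry_id": int(match["entry"]),
    }
```

Here the exception names ledger 173395, entry 0, which matches the `L173395 E0-E0` read failure in the broker log.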
The issue had no activity for 30 days, mark with Stale label.
Describe the bug: we have a Pulsar cluster running in k8s.
We decided to migrate the bookies to a new bookie cluster.
1. We enabled auto recovery.
I can see that recovery is working.
The old bookie cluster has 5 pods (0-4). I first shut down bookie-4, but in the end some ledgers were not migrated successfully:
bin/bookkeeper shell listunderreplicated
auto recovery log:
The main errors I noticed in the log:
ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Corrupted frame received from bookie:
ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L95981 E210-E210, Sent to [pulsar-cluster-bookie-3.pulsar-cluster-bookie.pulsar.svc.cluster.local:3181, pulsar-cluster-bookie-2.pulsar-cluster-bookie.pulsar.svc.cluster.local:3181], Heard from [] : bitset = {}, Error = 'Bookie handle is not available'. First unread entry is (-1, rc = null)
The ledger metadata:
Expected behavior: recovery completes and the recovery log shows no errors.