apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0
14.12k stars 3.57k forks source link

Open MLPendingAckStore managedLedger failed #11772

Open xuesongxs opened 3 years ago

xuesongxs commented 3 years ago

Describe the bug Pulsar v2.8.0 Cluster Info: 1、3 zookeepers 2、3 bookies 3、1 broker

Step: 1、set transactionCoordinatorEnabled=true in broker.conf 2、Consumer Code:

            ConsumerBuilder consumerBuilder = getClient().newConsumer(Schema.STRING);
            List<String> topics = new ArrayList<>();
            topics.add("persistent://public/default/test-string112");
            topics.add("persistent://public/default/test-string113");
            consumerBuilder.topics(topics)
                    .subscriptionName("consumerSych")
                    .subscriptionType(SubscriptionType.Shared)
                    .subscriptionInitialPosition(SubscriptionInitialPosition.Earliest);

            Consumer<String> consumer = consumerBuilder.subscribe();
            int count = 0;
            while (true) {
                Message msg = consumer.receive();
                consumer.acknowledge(msg);
                System.out.println("message:" + msg.getMessageId().toString() + ",curTime:" + System.currentTimeMillis());
            }

3、Start Consumer Consumer log:

2021-08-25 14:28:01.800 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string113-partition-0][consumerSych] Failed to subscribe to topic on /172.32.149.123:16650  
2021-08-25 14:28:01.801 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xdfab9ac0, L:/172.32.147.245:11860 - R:/172.32.149.123:16650] 8 request timedout after ms 30000  
2021-08-25 14:28:01.801 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string113-partition-1][consumerSych] Failed to subscribe to topic on /172.32.149.123:16650  
2021-08-25 14:28:01.802 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xdfab9ac0, L:/172.32.147.245:11860 - R:/172.32.149.123:16650] 9 request timedout after ms 30000  
2021-08-25 14:28:01.802 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string113-partition-2][consumerSych] Failed to subscribe to topic on /172.32.149.123:16650  
2021-08-25 14:28:01.802 [pulsar-client-io-1-1] WARN  o.a.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-f65a1] Failed to subscribe for topic [persistent://public/default/test-string113] in topics consumer org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 8 request timedout after ms 30000  
2021-08-25 14:28:01.804 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xdfab9ac0, L:/172.32.147.245:11860 - R:/172.32.149.123:16650] 10 request timedout after ms 30000  
2021-08-25 14:28:01.804 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string112-partition-1][consumerSych] Failed to subscribe to topic on /172.32.149.123:16650  
2021-08-25 14:28:01.804 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xdfab9ac0, L:/172.32.147.245:11860 - R:/172.32.149.123:16650] 11 request timedout after ms 30000  
2021-08-25 14:28:01.804 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string112-partition-0][consumerSych] Failed to subscribe to topic on /172.32.149.123:16650  
2021-08-25 14:28:01.805 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xdfab9ac0, L:/172.32.147.245:11860 - R:/172.32.149.123:16650] 12 request timedout after ms 30000  
2021-08-25 14:28:01.805 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string112-partition-2][consumerSych] Failed to subscribe to topic on /172.32.149.123:16650  
2021-08-25 14:28:01.805 [pulsar-client-io-1-1] WARN  o.a.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-f65a1] Failed to subscribe for topic [persistent://public/default/test-string112] in topics consumer org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 12 request timedout after ms 30000  
2021-08-25 14:28:01.806 [pulsar-client-io-1-1] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xdfab9ac0, L:/172.32.147.245:11860 - R:/172.32.149.123:16650] 13 request timedout after ms 30000  
2021-08-25 14:28:01.806 [pulsar-external-listener-3-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string113-partition-2] [consumerSych] Closed Consumer (not connected)  
2021-08-25 14:28:01.807 [pulsar-external-listener-3-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string113-partition-1] [consumerSych] Closed Consumer (not connected)  
2021-08-25 14:28:01.808 [pulsar-external-listener-3-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string113-partition-0] [consumerSych] Closed Consumer (not connected)  
2021-08-25 14:28:01.808 [pulsar-external-listener-3-1] WARN  o.a.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-f65a1] Failed to subscribe for topic [persistent://public/default/test-string113] in topics consumer, subscribe error: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 8 request timedout after ms 30000  
2021-08-25 14:28:01.808 [pulsar-external-listener-3-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string112-partition-1] [consumerSych] Closed Consumer (not connected)  
2021-08-25 14:28:01.808 [pulsar-external-listener-3-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string112-partition-2] [consumerSych] Closed Consumer (not connected)  
2021-08-25 14:28:01.808 [pulsar-external-listener-3-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/test-string112-partition-0] [consumerSych] Closed Consumer (not connected)  
2021-08-25 14:28:01.809 [pulsar-external-listener-3-1] WARN  o.a.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-f65a1] Failed to subscribe for topic [persistent://public/default/test-string112] in topics consumer, subscribe error: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 12 request timedout after ms 30000  
2021-08-25 14:28:01.809 [pulsar-external-listener-3-1] WARN  o.a.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-f65a1] Failed to subscribe topics: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 12 request timedout after ms 30000, closing consumer  
2021-08-25 14:28:01.809 [pulsar-external-listener-3-1] INFO  o.a.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-f65a1] [consumerSych] Closed Topics Consumer  
org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 12 request timedout after ms 30000

Process finished with exit code -1

4、vi broker.log:

14:27:26.172 [bookkeeper-ml-scheduler-OrderedScheduler-2-0] WARN  org.apache.pulsar.common.naming.TopicName - Could not get the partition index from the topic persistent://public/default/test-string112-partition-0-consumerSych__transaction_pending_ack
14:27:26.173 [bookkeeper-ml-scheduler-OrderedScheduler-2-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - Opening managed ledger public/default/persistent/test-string112-partition-0-consumerSych__transaction_pending_ack
14:27:26.181 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/test-string112-partition-0-consumerSych__transaction_pending_ack] Closing managed ledger
14:27:26.181 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] ERROR org.apache.pulsar.broker.transaction.pendingack.impl.MLPendingAckStoreProvider - Open MLPendingAckStore managedLedger failed.
org.apache.bookkeeper.mledger.ManagedLedgerException$ManagedLedgerNotFoundException: org.apache.bookkeeper.mledger.ManagedLedgerException$MetadataNotFoundException: Managed ledger not found
Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException$MetadataNotFoundException: Managed ledger not found
14:27:26.181 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] ERROR org.apache.pulsar.broker.transaction.pendingack.impl.PendingAckHandleImpl - PendingAckHandleImpl init fail! TopicName : persistent://public/default/test-string112-partition-0, SubName: consumerSych
java.util.concurrent.CompletionException: org.apache.bookkeeper.mledger.ManagedLedgerException$ManagedLedgerNotFoundException: org.apache.bookkeeper.mledger.ManagedLedgerException$MetadataNotFoundException: Managed ledger not found
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:661) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_251]
    at org.apache.pulsar.broker.transaction.pendingack.impl.MLPendingAckStoreProvider$1.openLedgerFailed(MLPendingAckStoreProvider.java:81) ~[org.apache.pulsar-pulsar-broker-2.8.0.jar:2.8.0]
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.lambda$asyncOpen$8(ManagedLedgerFactoryImpl.java:402) ~[org.apache.pulsar-managed-ledger-2.8.0.jar:2.8.0]
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_251]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_251]
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl$2.initializeFailed(ManagedLedgerFactoryImpl.java:397) ~[org.apache.pulsar-managed-ledger-2.8.0.jar:2.8.0]
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl$1.operationFailed(ManagedLedgerImpl.java:396) ~[org.apache.pulsar-managed-ledger-2.8.0.jar:2.8.0]
    at org.apache.bookkeeper.mledger.impl.MetaStoreImpl.lambda$getManagedLedgerInfo$2(MetaStoreImpl.java:87) ~[org.apache.pulsar-managed-ledger-2.8.0.jar:2.8.0]
    at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670) [?:1.8.0_251]
    at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646) [?:1.8.0_251]
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) [?:1.8.0_251]
    at org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203) [org.apache.bookkeeper-bookkeeper-common-4.14.1.jar:4.14.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_251]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_251]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_251]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_251]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.63.Final.jar:4.1.63.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_251]
Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException$ManagedLedgerNotFoundException: org.apache.bookkeeper.mledger.ManagedLedgerException$MetadataNotFoundException: Managed ledger not found
Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException$MetadataNotFoundException: Managed ledger not found
14:27:38.009 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Only 1 broker available: no load shedding will be performed
lhotari commented 3 years ago

@xuesongxs the description doesn't explain how to reproduce this issue. Please provide a detailed procedure for reproducing the issue. Providing that information will increase the chances that your problem gets fixed. Reproducing issues is a large part of the work in fixing issues.

xuesongxs commented 3 years ago

@xuesongxs the description doesn't explain how to reproduce this issue. Please provide a detailed procedure for reproducing the issue. Providing that information will increase the chances that your problem gets fixed. Reproducing issues is a large part of the work in fixing issues.

I modified the description of the issue and added the step description.

codelipenghui commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.