Open MichalKoziorowski-TomTom opened 2 years ago
@merlimat @codelipenghui Please take a look at this issue. Is it a gap in PIP-45 changes? We are hitting similar problems in tests, for example #13954.
The issue had no activity for 30 days, mark with Stale label.
@lhotari this issue looks stale; is it still a problem?
I'm still seeing the same error in 2.9.2 and 2.9.3; it occasionally happens when we restart one or more ZooKeeper nodes.
same issue
2023-08-25T10:23:08,554+0000 [metadata-store-6-1] WARN org.apache.pulsar.broker.service.ServerCnx - Failed to get Partitioned Metadata [/192.168.49.179:43660] non-persistent://public/devops/fstress-replay-134: org.apache.pulsar.metadata.api.MetadataStoreException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /admin/partitioned-topics/public/devops/non-persistent/fstress-replay-134
org.apache.pulsar.metadata.api.MetadataStoreException$AlreadyExistsException: org.apache.pulsar.metadata.api.MetadataStoreException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /admin/partitioned-topics/public/devops/non-persistent/fstress-replay-134
at org.apache.pulsar.metadata.cache.impl.MetadataCacheImpl.lambda$create$12(MetadataCacheImpl.java:234) ~[org.apache.pulsar-pulsar-metadata-2.9.3.jar:2.9.3]
at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) ~[?:?]
at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
at org.apache.pulsar.metadata.impl.ZKMetadataStore.lambda$storePut$17(ZKMetadataStore.java:261) ~[org.apache.pulsar-pulsar-metadata-2.9.3.jar:2.9.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.pulsar.metadata.api.MetadataStoreException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /admin/partitioned-topics/public/devops/non-persistent/fstress-replay-134
at org.apache.pulsar.metadata.impl.ZKMetadataStore.getException(ZKMetadataStore.java:345) ~[org.apache.pulsar-pulsar-metadata-2.9.3.jar:2.9.3]
... 5 more
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /admin/partitioned-topics/public/devops/non-persistent/fstress-replay-134
at org.apache.zookeeper.KeeperException.create(KeeperException.java:122) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.pulsar.metadata.impl.ZKMetadataStore.getException(ZKMetadataStore.java:341) ~[org.apache.pulsar-pulsar-metadata-2.9.3.jar:2.9.3]
... 5 more
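For context: KeeperErrorCode = BadVersion is ZooKeeper's standard rejection of a conditional write whose expected znode version no longer matches the version stored on the server, and in the trace above MetadataCacheImpl then wraps that BadVersionException into an AlreadyExistsException, which is what the client ends up seeing. A minimal sketch with the plain ZooKeeper Java client (hypothetical connect string, path, and values; not Pulsar's actual code path) shows how a stale cached version reproduces the same exception:

```java
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class BadVersionSketch {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Hypothetical connect string; adjust for your environment.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30_000, event -> {
            if (event.getState() == KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        String path = "/badversion-sketch"; // hypothetical test znode
        zk.create(path, "v1".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Read the znode and remember its version, as a metadata cache would.
        Stat stat = new Stat();
        zk.getData(path, false, stat);
        int cachedVersion = stat.getVersion();

        // Another writer updates the znode behind the cache's back,
        // bumping the server-side version.
        zk.setData(path, "v2".getBytes(), cachedVersion);

        // A conditional write that still carries the stale version is rejected
        // with the same KeeperErrorCode = BadVersion seen in the broker logs.
        try {
            zk.setData(path, "v3".getBytes(), cachedVersion);
        } catch (KeeperException.BadVersionException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        zk.delete(path, -1); // clean up; -1 matches any version
        zk.close();
    }
}
```

This only illustrates the generic version-conflict mechanism; why the broker's cached version stays stale after a ZooKeeper node is restarted or replaced is the open question in this issue.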
same issue with pulsar 2.10.3
2024-05-08T16:26:38,949+0000 [bookkeeper-ml-scheduler-OrderedScheduler-3-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl - [public/bilanx/persistent/enriched-events-partition-2] Failed to initialize managed ledger: org.apache.bookkeeper.mledger.ManagedLedgerException$BadVersionException: org.apache.pulsar.metadata.api.MetadataStoreException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /managed-ledgers/public/bilanx/persistent/enriched-events-partition-2
Also observed that many clients failed to create consumers on the affected topic 😢
Describe the bug
Pulsar version: 2.8.2 + branch-2.8 up to https://github.com/apache/pulsar/tree/36443112c86afc296780c180154f922d611ebf25
One of the ZooKeeper servers died and was replaced in Kubernetes. Now we constantly see the following error in the broker logs:
Before that, while one of the ZooKeeper servers was dying, I saw lots of errors like the ones below, but those stopped appearing after ZooKeeper was replaced.