Open Raven888888 opened 3 years ago
Could you provide your KoP version? And can you reproduce it?
KoP: 2.8.1.4. I hit this issue after upgrading from Pulsar 2.7.0 to 2.8.1. I will try to reproduce it when I have the capacity.
Regarding "upgrading from pulsar 2.7.0 to 2.8.1": is this similar to https://github.com/streamnative/kop/issues/765?
Nope, @BewareMyPower. That issue only happens with Pulsar + KoP 2.7.0, and it is caused by how KoP handles offset calculation, as you mentioned. Since upgrading to 2.8.1, I have not seen the offset gap.
The current issue somehow prevents the broker from starting at all, so KoP is not functional either.
Yeah, I also noticed it.
Caused by: org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: The subscription c-pulsar-cluster-1-fw-192.168.3.3-8080-function-metadata-tailer-reader-7223e8a3e9 of the topic persistent://public/functions/metadata gets the last message id was failed
Failed to get batch size for entry java.lang.IllegalArgumentException: Invalid unknonwn tag type: 6
at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:987) ~[org.apache.pulsar-pulsar-client-api-2.8.1.jar:2.8.1]
at org.apache.pulsar.client.impl.ConsumerImpl.hasMessageAvailable(ConsumerImpl.java:1869) ~[org.apache.pulsar-pulsar-client-original-2.8.1.jar:2.8.1]
at org.apache.pulsar.client.impl.ReaderImpl.hasMessageAvailable(ReaderImpl.java:168) ~[org.apache.pulsar-pulsar-client-original-2.8.1.jar:2.8.1]
at org.apache.pulsar.functions.worker.FunctionMetaDataManager.initialize(FunctionMetaDataManager.java:109) ~[org.apache.pulsar-pulsar-functions-worker-2.8.1.jar:2.8.1]
at org.apache.pulsar.functions.worker.PulsarWorkerService.start(PulsarWorkerService.java:496) ~[org.apache.pulsar-pulsar-functions-worker-2.8.1.jar:2.8.1]
at org.apache.pulsar.broker.PulsarService.startWorkerService(PulsarService.java:1569) ~[org.apache.pulsar-pulsar-broker-2.8.1.jar:2.8.1]
at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:759) ~[org.apache.pulsar-pulsar-broker-2.8.1.jar:2.8.1]
It looks like there's something wrong with the topic persistent://public/functions/metadata. If you enabled Pulsar Functions before enabling KoP, this topic might not contain the BrokerEntryMetadata part. After enabling KoP, the reader of this topic might fail in ConsumerImpl.hasMessageAvailable.
I'm not familiar with Pulsar Functions, so it might take some time to look into the reason.
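To illustrate what "does not contain the BrokerEntryMetadata part" could mean at the byte level, here is a minimal sketch. It assumes that an entry carrying broker entry metadata starts with a two-byte big-endian magic number, and that the value is 0x0e02 (recalled from Pulsar's Commands class, not confirmed in this thread). The class and method names are hypothetical and are not part of Pulsar or KoP.

```java
import java.nio.ByteBuffer;

public class BrokerEntryMetadataCheck {
    // ASSUMPTION: magic number that prefixes broker entry metadata in a stored entry.
    // The value is recalled from Pulsar's Commands class and should be verified.
    public static final short MAGIC_BROKER_ENTRY_METADATA = 0x0e02;

    /**
     * Returns true if the entry payload begins with the assumed broker entry
     * metadata magic number. The buffer's position is left unchanged.
     */
    public static boolean hasBrokerEntryMetadata(ByteBuffer entry) {
        if (entry.remaining() < 2) {
            return false; // too short to hold the two-byte magic number
        }
        // Absolute read at the current position; ByteBuffer is big-endian by default.
        return entry.getShort(entry.position()) == MAGIC_BROKER_ENTRY_METADATA;
    }

    public static void main(String[] args) {
        // Hypothetical entry written after broker entry metadata was enabled.
        ByteBuffer withMetadata = ByteBuffer.wrap(new byte[] {0x0e, 0x02, 0, 0, 0, 0});
        // Hypothetical entry written before it was enabled (different leading bytes).
        ByteBuffer withoutMetadata = ByteBuffer.wrap(new byte[] {0x0e, 0x01, 0, 0, 0, 0});

        System.out.println(hasBrokerEntryMetadata(withMetadata));
        System.out.println(hasBrokerEntryMetadata(withoutMetadata));
    }
}
```

If this assumption holds, entries written to persistent://public/functions/metadata before KoP was enabled would fail such a check, which would be consistent with the reader error above.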
@BewareMyPower yes, you are right. Pulsar Functions was enabled before KoP. As a side note, though, this setup wasn't an issue back in Pulsar 2.7.0.
Correct me if I am wrong: Pulsar Functions are stateless, so could I just drop persistent://public/functions/metadata and reinstall the Pulsar function?
It looks like I only replied via email so it didn't appear in this issue.
IMO, it's correct and you can delete this topic manually. Have you tried it?
No, I haven't, @BewareMyPower. I will try it when I have the capacity, thanks.
The issue had no activity for 30 days, mark with Stale label.
Hey, for what it's worth, I'm seeing this in Pulsar 2.10.1 with KoP:
2022-07-28T12:54:19,013+0000 [BookKeeperClientWorker-OrderedExecutor-7-0] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [adtonos-platform-prod/platform/persistent/AvailablePlayoutRecognized-partition-12] Opened new cursor: NonDurableCursorImpl{ledger=adtonos-platform-prod/platform/persistent/AvailablePlayoutRecognized-partition-12, ackPos=42:-1, readPos=42:0}
2022-07-28T12:54:19,016+0000 [BookKeeperClientWorker-OrderedExecutor-9-0] ERROR io.streamnative.pulsar.handlers.kop.MessageFetchContext - [persistent://adtonos-platform-prod/platform/AvailablePlayoutRecognized-14] Failed to peekOffsetFromEntry from position 35:4: No BrokerEntryMetadata found
2022-07-28T12:54:19,016+0000 [BookKeeperClientWorker-OrderedExecutor-7-0] ERROR io.streamnative.pulsar.handlers.kop.MessageFetchContext - [persistent://adtonos-platform-prod/platform/AvailablePlayoutRecognized-12] Failed to peekOffsetFromEntry from position 42:4: No BrokerEntryMetadata found
2022-07-28T12:54:19,016+0000 [BookKeeperClientWorker-OrderedExecutor-9-0] ERROR io.streamnative.pulsar.handlers.kop.MessageFetchContext - Read entry error on (offset=774095, logStartOffset=-1, maxBytes=1048576)
io.streamnative.pulsar.handlers.kop.exceptions.MetadataCorruptedException$NoBrokerEntryMetadata: No BrokerEntryMetadata found
at io.streamnative.pulsar.handlers.kop.utils.MessageMetadataUtils.peekOffset(MessageMetadataUtils.java:127) ~[?:?]
at io.streamnative.pulsar.handlers.kop.utils.MessageMetadataUtils.peekOffsetFromEntry(MessageMetadataUtils.java:118) ~[?:?]
at io.streamnative.pulsar.handlers.kop.MessageFetchContext$2.readEntriesComplete(MessageFetchContext.java:555) ~[?:?]
at org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$2(OpReadEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.10.1.jar:2.10.1]
at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.10.1.jar:2.10.1]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.5.jar:4.14.5]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
2022-07-28T12:54:19,016+0000 [BookKeeperClientWorker-OrderedExecutor-7-0] ERROR io.streamnative.pulsar.handlers.kop.MessageFetchContext - Read entry error on (offset=897452, logStartOffset=-1, maxBytes=1048576)
io.streamnative.pulsar.handlers.kop.exceptions.MetadataCorruptedException$NoBrokerEntryMetadata: No BrokerEntryMetadata found
at io.streamnative.pulsar.handlers.kop.utils.MessageMetadataUtils.peekOffset(MessageMetadataUtils.java:127) ~[?:?]
at io.streamnative.pulsar.handlers.kop.utils.MessageMetadataUtils.peekOffsetFromEntry(MessageMetadataUtils.java:118) ~[?:?]
at io.streamnative.pulsar.handlers.kop.MessageFetchContext$2.readEntriesComplete(MessageFetchContext.java:555) ~[?:?]
at org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$2(OpReadEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.10.1.jar:2.10.1]
at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.10.1.jar:2.10.1]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.5.jar:4.14.5]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
2022-07-28T12:54:19,019+0000 [BookKeeperClientWorker-OrderedExecutor-10-0] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [adtonos-platform-prod/platform/persistent/AvailablePlayoutRecognized-partition-0] Unable to find position for predicate FindEntryByOffset{ 778995}. Use the first position 27:0 instead.
But I'm able to start the broker - this is at run-time.
It kind of works, but I'm getting a constant spam of this message in the logs.
I hit the same problem: the broker cluster is unavailable because of the function worker. The error is "The subscription c-pulsar-fw-pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local.-8080-function-metadata-tailer-reader-28f8e65105 of the topic persistent://public/functions/metadata gets the last message id was failed"
Describe the bug
Exactly the same as #11774. Unable to start the broker. Also similar to #10967.
Error log
Expected behavior
Able to restart and run the broker without any error. This was claimed to be fixed by #10968, but I still face the same issue.
Additional context
Pulsar version: 2.8.1
Pulsar mode: cluster, 3 nodes
KoP: enabled (#10950)