streamnative / pulsar-archived

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org
Apache License 2.0

ISSUE-13967: [Functions] ZK metadata badly initialised after PR13296 #3641

Open sijie opened 2 years ago

sijie commented 2 years ago

Original Issue: apache/pulsar#13967


Describe the bug

When running distributed system tests using Data Generator Pulsar IO sources, this error message is produced by the function workers:

2022-01-25T01:20:51,212+0000 [function-web-26-5] INFO  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Initialize rackaware ensemble placement policy @ <Bookie:10.244.4.9:0> @ /default-region/default-rack : org.apache.distributedlog.net.DNSResolverForRacks.
2022-01-25T01:20:51,212+0000 [function-web-26-5] INFO  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Not weighted
2022-01-25T01:20:51,232+0000 [function-web-26-5] INFO  org.apache.bookkeeper.client.BookKeeper - Weighted ledger placement is not enabled
2022-01-25T01:20:51,242+0000 [function-web-26-5] ERROR org.apache.bookkeeper.client.BookieWatcherImpl - Failed to get bookie list : 
org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
    at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$4(ZKRegistrationClient.java:351) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1177) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:678) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:563) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /available
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$4(ZKRegistrationClient.java:350) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    ... 3 more
2022-01-25T01:20:51,243+0000 [function-web-26-5] ERROR org.apache.distributedlog.bk.SimpleLedgerAllocator - Error creating ledger for allocating /pulsar/functions/public/default/data-generator-source/b2da946a-c387-44c8-a7ed-fee73c040c6e-pulsar-io-data-generator-2.10.0-SNAPSHOT.nar/<default>/allocation : 
java.io.IOException: org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
    at org.apache.distributedlog.BookKeeperClient.commonInitialization(BookKeeperClient.java:123) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BookKeeperClient.initialize(BookKeeperClient.java:172) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BookKeeperClient.get(BookKeeperClient.java:199) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BookKeeperClient.createLedger(BookKeeperClient.java:211) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.bk.SimpleLedgerAllocator.allocateLedger(SimpleLedgerAllocator.java:370) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.bk.SimpleLedgerAllocator.allocate(SimpleLedgerAllocator.java:271) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.bk.LedgerAllocatorDelegator.allocate(LedgerAllocatorDelegator.java:67) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.impl.logsegment.BKLogSegmentAllocator.allocate(BKLogSegmentAllocator.java:55) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKLogWriteHandler.doStartLogSegment(BKLogWriteHandler.java:571) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKLogWriteHandler$10.onSuccess(BKLogWriteHandler.java:538) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKLogWriteHandler$10.onSuccess(BKLogWriteHandler.java:530) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:42) ~[org.apache.bookkeeper-bookkeeper-common-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26) ~[org.apache.bookkeeper-bookkeeper-common-4.14.4.jar:4.14.4]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
    at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:883) ~[?:?]
    at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2251) ~[?:?]
    at org.apache.distributedlog.BKLogWriteHandler.asyncStartLogSegment(BKLogWriteHandler.java:530) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAbstractLogWriter.lambda$asyncStartNewLogSegment$1(BKAbstractLogWriter.java:379) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1106) ~[?:?]
    at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) ~[?:?]
    at org.apache.distributedlog.BKAbstractLogWriter.asyncStartNewLogSegment(BKAbstractLogWriter.java:378) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAbstractLogWriter.rollLogSegmentIfNecessary(BKAbstractLogWriter.java:517) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAsyncLogWriter.doGetLogSegmentWriter(BKAsyncLogWriter.java:226) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAsyncLogWriter.getLogSegmentWriter(BKAsyncLogWriter.java:211) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAsyncLogWriter.getLogSegmentWriter(BKAsyncLogWriter.java:249) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAsyncLogWriter.rollLogSegmentAndIssuePendingRequests(BKAsyncLogWriter.java:344) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAsyncLogWriter.asyncWrite(BKAsyncLogWriter.java:302) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.BKAsyncLogWriter.write(BKAsyncLogWriter.java:420) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.distributedlog.AppendOnlyStreamWriter.write(AppendOnlyStreamWriter.java:50) ~[org.apache.distributedlog-distributedlog-core-4.14.4.jar:4.14.4]
    at org.apache.pulsar.functions.worker.dlog.DLOutputStream.write(DLOutputStream.java:59) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.apache.pulsar.functions.worker.dlog.DLOutputStream.write(DLOutputStream.java:54) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.apache.pulsar.functions.worker.WorkerUtils.uploadToBookKeeper(WorkerUtils.java:100) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.apache.pulsar.functions.worker.WorkerUtils.uploadFileToBookkeeper(WorkerUtils.java:75) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.apache.pulsar.functions.worker.rest.api.ComponentImpl.getFunctionPackageLocation(ComponentImpl.java:306) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.apache.pulsar.functions.worker.rest.api.SourcesImpl.registerSource(SourcesImpl.java:227) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.apache.pulsar.functions.worker.rest.api.v3.SourcesApiV3Resource.registerSource(SourcesApiV3Resource.java:73) ~[org.apache.pulsar-pulsar-functions-worker-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) ~[org.glassfish.jersey.core-jersey-common-2.34.jar:?]
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) ~[org.glassfish.jersey.core-jersey-common-2.34.jar:?]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:292) ~[org.glassfish.jersey.core-jersey-common-2.34.jar:?]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:274) ~[org.glassfish.jersey.core-jersey-common-2.34.jar:?]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:244) ~[org.glassfish.jersey.core-jersey-common-2.34.jar:?]
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) ~[org.glassfish.jersey.core-jersey-common-2.34.jar:?]
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680) ~[org.glassfish.jersey.core-jersey-server-2.34.jar:?]
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394) ~[org.glassfish.jersey.containers-jersey-container-servlet-core-2.34.jar:?]
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346) ~[org.glassfish.jersey.containers-jersey-container-servlet-core-2.34.jar:?]
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:366) ~[org.glassfish.jersey.containers-jersey-container-servlet-core-2.34.jar:?]
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:319) ~[org.glassfish.jersey.containers-jersey-container-servlet-core-2.34.jar:?]
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205) ~[org.glassfish.jersey.containers-jersey-container-servlet-core-2.34.jar:?]
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799) ~[org.eclipse.jetty-jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1631) ~[org.eclipse.jetty-jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.apache.pulsar.broker.web.AuthenticationFilter.doFilter(AuthenticationFilter.java:81) ~[org.apache.pulsar-pulsar-broker-common-2.10.0-SNAPSHOT.jar:2.10.0-SNAPSHOT]
    at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) ~[org.eclipse.jetty-jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601) ~[org.eclipse.jetty-jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548) ~[org.eclipse.jetty-jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501) ~[org.eclipse.jetty-jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:179) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.Server.handle(Server.java:516) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:400) ~[org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:645) [org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:392) [org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) [org.eclipse.jetty-jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) [org.eclipse.jetty-jetty-io-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) [org.eclipse.jetty-jetty-io-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) [org.eclipse.jetty-jetty-io-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) [org.eclipse.jetty-jetty-util-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) [org.eclipse.jetty-jetty-util-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) [org.eclipse.jetty-jetty-util-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) [org.eclipse.jetty-jetty-util-9.4.44.v20210927.jar:9.4.44.v20210927]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) [org.eclipse.jetty-jetty-util-9.4.44.v20210927.jar:9.4.44.v20210927]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
    at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$4(ZKRegistrationClient.java:351) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1177) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:678) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:563) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /available
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$4(ZKRegistrationClient.java:350) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1177) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:678) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:563) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
2022-01-25T01:20:51,245+0000 [function-web-26-5] INFO  org.apache.distributedlog.bk.SimpleLedgerAllocator - Ledger allocator /pulsar/functions/public/default/data-generator-source/b2da946a-c387-44c8-a7ed-fee73c040c6e-pulsar-io-data-generator-2.10.0-SNAPSHOT.nar/<default>/allocation moved to phase ERROR : version = 0.
2022-01-25T01:20:51,627+0000 [DLM-/pulsar/functions-OrderedScheduler-0-0] INFO  org.apache.distributedlog.bk.SimpleLedgerAllocator - Abort ledger allocator without cleaning up on /pulsar/functions/public/default/data-generator-source/b2da946a-c387-44c8-a7ed-fee73c040c6e-pulsar-io-data-generator-2.10.0-SNAPSHOT.nar/<default>/allocation.
2022-01-25T01:20:51,629+0000 [function-web-26-5] ERROR org.apache.pulsar.functions.worker.rest.api.SourcesImpl - Failed process Source public/default/data-generator-source package: 
org.apache.distributedlog.exceptions.WriteException: Write rejected because stream public/default/data-generator-source/b2da946a-c387-44c8-a7ed-fee73c040c6e-pulsar-io-data-generator-2.10.0-SNAPSHOT.nar has encountered an error : writer has been closed due to error.
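
To see at a glance which ZooKeeper path the client failed to find, the `NoNode` causes in a log like the one above can be extracted with a short script. This is only a sketch: the function name `missing_znodes` and the `sample` text are illustrative, and the regular expression assumes the `KeeperErrorCode = NoNode for <path>` wording shown in the trace.

```python
import re
from typing import List

# Matches the ZooKeeper "NoNode" cause lines as they appear in the worker log.
NO_NODE = re.compile(r"KeeperErrorCode = NoNode for (\S+)")


def missing_znodes(log_text: str) -> List[str]:
    """Return the distinct znode paths reported as missing, in first-seen order."""
    seen: List[str] = []
    for match in NO_NODE.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


# Two cause lines copied from the trace above (illustrative sample input).
sample = (
    "Caused by: org.apache.zookeeper.KeeperException$NoNodeException: "
    "KeeperErrorCode = NoNode for /available\n"
    "Caused by: org.apache.zookeeper.KeeperException$NoNodeException: "
    "KeeperErrorCode = NoNode for /available\n"
)
print(missing_znodes(sample))
```

On the trace in this report, the only path it would surface is `/available`.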

The metadata does not appear to be set up correctly: the client looks up the ZK node /available, which has not been initialised. After some debugging on a healthy cluster, my understanding is that the path should be /ledgers/available, which would explain why we fail to get the list of bookies.
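
The hypothesis above can be illustrated with a small sketch of how a registration path might be derived from a ledgers root. This is not BookKeeper's actual code: `bookie_registration_path` is a hypothetical helper, and the assumption is simply that the client appends `available` to whatever root path it was configured with. Under that assumption, an empty or missing root yields exactly the `/available` path seen in the log.

```python
import posixpath


def bookie_registration_path(ledgers_root: str) -> str:
    """Join a ledgers root with the 'available' child znode (hypothetical helper)."""
    # Normalise the root so that an empty value behaves like "/".
    root = ledgers_root if ledgers_root.startswith("/") else "/" + ledgers_root
    return posixpath.join(root, "available")


# Root as observed on the healthy cluster.
print(bookie_registration_path("/ledgers"))  # /ledgers/available
# Empty root, as might happen if the metadata is not initialised (assumption).
print(bookie_registration_path(""))          # /available
```

If this is what happens, the fix would lie in how the function worker passes the ledgers root to the BookKeeper client, rather than in ZooKeeper itself.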

To Reproduce

  1. Build a Docker image from master
  2. Deploy a Pulsar cluster on Kubernetes (for example with the DataStax Helm chart) and instantiate a function producer, e.g.:
apiVersion: batch/v1
kind: Job
metadata:
  name: function-producer-starter
spec:
  template:
    spec:
      containers:
        - name: function-producer-starter
          image: master-image/pulsar-all
          command:
            - /bin/sh
            - -c
            - >-
              env &&
              /pulsar/bin/apply-config-from-env.py /pulsar/conf/client.conf &&
              bin/pulsar-admin sources create
              -t data-generator --name data-generator-source
              --source-config '{"sleepBetweenMessages":"10"}'
              --destination-topic-name persistent://public/default/test
      restartPolicy: Never

Additional context

@nicoloboschi and I found that this regression was introduced by #13296: all of its child commits are affected, whereas its parent (#13683) is not.

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.