Version
4.0.3
Context
In development, when nodes of the (ZooKeeper) cluster are started and stopped frequently, we intermittently receive this exception:
java.lang.NullPointerException
    at io.vertx.spi.cluster.zookeeper.impl.ZKMap.keyPath(ZKMap.java:70)
    at io.vertx.spi.cluster.zookeeper.impl.ZKSyncMap.get(ZKSyncMap.java:98)
    at io.vertx.spi.cluster.zookeeper.impl.ZKSyncMap.lambda$entrySet$5(ZKSyncMap.java:225)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1556)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
    at io.vertx.spi.cluster.zookeeper.impl.ZKSyncMap.entrySet(ZKSyncMap.java:227)
    at io.vertx.core.impl.HAManager.nodeLeft(HAManager.java:323)
    at io.vertx.core.impl.HAManager.access$100(HAManager.java:94)
    at io.vertx.core.impl.HAManager$1.nodeLeft(HAManager.java:150)
    at io.vertx.spi.cluster.zookeeper.ZookeeperClusterManager.childEvent(ZookeeperClusterManager.java:419)
    at com.neterium.context.ClusterContext$1.childEvent(ClusterContext.java:72)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:538)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:532)
    at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100)
    at org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:530)
    at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:808)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Just before this, Vert.x logged:
2021-10-14 13:10:52.903 ERROR 34324 --- [ntloop-thread-0] io.vertx.core.net.impl.ConnectionBase : An existing connection was forcibly closed by the remote host
2021-10-14 13:11:13.987 WARN 34324 --- [ChildrenCache-0] i.v.s.cluster.zookeeper.impl.ZKSyncMap : node lost KeeperErrorCode = NoNode for /io.vertx/default/syncMap/__vertx.haInfo/45db3530-9ce0-4c94-9374-1355b20ab753
Do you have a reproducer?
Not really; it happens randomly, but killing servers ungracefully increases the odds.
Extra
An easy fix might be to add a null check on the "k" argument of ZKMap::keyPath(k) and, perhaps, to handle a null return from keyPath in ZKSyncMap::get(k).
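A minimal sketch of what those guards could look like. This is hypothetical code for illustration only, not the real vertx-zookeeper internals: the class name NullGuardSketch, the in-memory store, and the exact path-building logic are all assumptions; only the ZKMap::keyPath / ZKSyncMap::get names and the /io.vertx/default/syncMap/__vertx.haInfo path come from the report above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the null guards suggested above.
// Names mirror ZKMap/ZKSyncMap, but this is not the actual implementation.
public class NullGuardSketch {

    static final String MAP_PATH = "/io.vertx/default/syncMap/__vertx.haInfo";

    // keyPath currently dereferences k, which throws NullPointerException
    // when a node vanished between listing and reading. Guarding it first
    // lets callers decide how to handle the missing key.
    public static String keyPath(Object k) {
        if (k == null) {
            return null; // or throw a descriptive IllegalArgumentException
        }
        return MAP_PATH + "/" + k;
    }

    // get would then treat an unbuildable path as "entry not found"
    // instead of propagating an NPE up through entrySet().
    public static String get(Object k, Map<String, String> store) {
        String path = keyPath(k);
        return path == null ? null : store.get(path);
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put(MAP_PATH + "/node-1", "ha-info-1");
        // A null key no longer throws NullPointerException:
        System.out.println(get(null, store));     // null
        System.out.println(get("node-1", store)); // ha-info-1
    }
}
```

With this shape, entrySet() iterating over stale keys would simply skip entries whose znode disappeared, which matches the "node lost ... NoNode" warning that precedes the crash.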