vert-x3 / vertx-hazelcast

Hazelcast Cluster Manager for Vert.x
Apache License 2.0

Vertx eventbus not working even after cluster formation #193

Open harisht87 opened 4 months ago

harisht87 commented 4 months ago

Questions

We have two services, built with the Vert.x framework, running on an OpenShift/K8s platform. cluster.xml is configured to use the Discovery SPI in DNS_LOOKUP mode, and the cluster forms correctly (logs below). The Vert.x event bus is used as shown in the code below.

However, sending a message from one service to the other does not work, and the Vert.x ConnectionHolder logs a "Not connected to server" message.

Service 1 : vertx.eventBus().request("child",s,res-> { /* response handler goes here */ });
Service 2 : vertx.eventBus().consumer("child", (message)  -> { /** message handler goes here **/ });

Version

Hazelcast version: 5.3.6
Vert.x version: 4.5.7
vertx-hazelcast version: 4.5.7
K8s version: v1.25.11+c43ddea
OS: Red Hat Enterprise Linux 8.9 (Ootpa)
Java: JDK 8

Context

Sending a message from one service to the other does not work; the sender logs the messages below.

03:45:49.772 [vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=a6eaf1fa-a3d2-4822-a90e-3102f9a6e215 Not connected to server ae5dd6c0-a118-4af6-a2ef-e4c708eaf32e - starting queuing
03:45:49.855 [vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=a6eaf1fa-a3d2-4822-a90e-3102f9a6e215 Draining the queue for server ae5dd6c0-a118-4af6-a2ef-e4c708eaf32e
03:45:49.857 [vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=a6eaf1fa-a3d2-4822-a90e-3102f9a6e215 Cluster connection closed for server ae5dd6c0-a118-4af6-a2ef-e4c708eaf32e

Do you have a reproducer?

No

Attaching the Git URL for the previous issue, raised under vert.x as 5194.

Steps to reproduce

  1. Hazelcast service discovery initialized
    May 29, 2024 3:23:40 AM com.hazelcast.spi.discovery.integration.DiscoveryService
    INFO: [192.168.168.24]:5701 [demo] [5.3.6] Kubernetes Discovery properties: { service-dns: demo-cfg-hsvc.demo-v2.svc.cluster.local, service-dns-timeout: 60, service-name: null, service-port: 0, service-label: null, service-label-value: true, namespace: demo-v2, pod-label: null, pod-label-value: null, resolve-not-ready-addresses: true, expose-externally-mode: AUTO, use-node-name-as-external-address: false, service-per-pod-label: null, service-per-pod-label-value: null, kubernetes-api-retries: 3, kubernetes-master: https://kubernetes.default.svc}
    May 29, 2024 3:23:40 AM com.hazelcast.spi.discovery.integration.DiscoveryService
    INFO: [192.168.168.24]:5701 [demo] [5.3.6] Kubernetes Discovery activated with mode: DNS_LOOKUP
  2. Service A starts
  3. Service B starts
  4. Services A and B form a cluster
    INFO: [192.168.168.24]:5701 [demo] [5.3.6] Initialized new cluster connection between /192.168.168.24:5701 and /192.168.96.7:54149
    May 29, 2024 3:34:08 AM com.hazelcast.internal.cluster.ClusterService
    INFO: [192.168.168.24]:5701 [demo] [5.3.6]
    Members {size:2, ver:4} [
    Member [192.168.168.24]:5701 - baee1e7b-58d7-49f8-9b98-7becdcde0ca8 this
    Member [192.168.96.7]:5701 - c0257121-f67b-45c7-bc21-5d9ae90f044b lite
    ]
  5. An HTTP call from Service A to Service B via the event bus does not work
    03:45:49.772 [vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=a6eaf1fa-a3d2-4822-a90e-3102f9a6e215 Not connected to server ae5dd6c0-a118-4af6-a2ef-e4c708eaf32e - starting queuing
    03:45:49.855 [vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=a6eaf1fa-a3d2-4822-a90e-3102f9a6e215 Draining the queue for server ae5dd6c0-a118-4af6-a2ef-e4c708eaf32e
    03:45:49.857 [vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=a6eaf1fa-a3d2-4822-a90e-3102f9a6e215 Cluster connection closed for server ae5dd6c0-a118-4af6-a2ef-e4c708eaf32e
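
The ConnectionHolder messages in step 5 concern the event bus's own TCP connection between nodes, which is separate from the Hazelcast member connection on port 5701 shown in step 4. A formed Hazelcast cluster therefore does not by itself show that the event bus port of the other pod is reachable. As a rough diagnostic, and only a sketch, the following plain-JDK probe can be run from inside one pod against the other pod's event bus address. The class name `PortProbe` is hypothetical, and the host and port must be supplied by you (they are not given in this issue).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Vert.x or Hazelcast.
final class PortProbe {

    // Returns true if a TCP connection to host:port can be opened within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: PortProbe <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

If the probe fails for the event bus port while the Hazelcast port 5701 succeeds, the event bus host and port settings (for example the commented-out `setHost` call in the startup code) would be a reasonable place to look next.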

Extra

Application Hazelcast cluster.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-4.0.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <lite-member enabled="true"/>
    <properties>
        <property name="hazelcast.discovery.enabled">true</property>
        <property name="hazelcast.rest.enabled">true</property>
        <property name="hazelcast.partial.member.disconnection.resolution.heartbeat.count">5</property>
        <property name="hazelcast.partial.member.disconnection.resolution.algorithm.timeout.seconds">10</property>
    </properties>
    <cluster-name>demo</cluster-name>
    <network>
        <join>
            <multicast enabled="false"/>

            <discovery-strategies>
                <discovery-strategy enabled="true"
                                    class="com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy"  >
                    <properties>
                        <property name="service-dns">demo-cfg-hsvc.demo-v2.svc.cluster.local</property>
                        <property name="service-dns-timeout">60</property>
                    </properties>
                </discovery-strategy>
            </discovery-strategies>
        </join>
    </network>
</hazelcast>

Hazelcast cluster manager set as below and verticles are deployed:

HazelcastClusterManager mgr = new HazelcastClusterManager();
options.setClusterManager(mgr);
// options.getEventBusOptions().setHost(host);
Vertx.clusteredVertx(options, ar -> {
    if (ar.failed()) {
        // ar.result() is null when the clustered instance could not be created
        logger.error("Failed to start clustered Vert.x", ar.cause());
        return;
    }
    Vertx vertx = ar.result();
    vertx.deployVerticle(verticleName, deploymentOptions);
    vertx.exceptionHandler(event -> logger.error("Vertx exception ", event));
    if (isMdc) {
        setInterceptor(vertx);
    }
});

tsegismont commented 3 months ago

Can you add the configuration for the required Vert.x distributed data structures?

<multimap name="__vertx.subs">
 <backup-count>1</backup-count>
 <value-collection-type>SET</value-collection-type>
</multimap>

<map name="__vertx.haInfo">
 <backup-count>1</backup-count>
</map>

<map name="__vertx.nodeInfo">
 <backup-count>1</backup-count>
</map>

<cp-subsystem>
 <cp-member-count>0</cp-member-count>
 <semaphores>
   <semaphore>
     <name>__vertx.*</name>
     <jdk-compatible>false</jdk-compatible>
     <initial-permits>1</initial-permits>
   </semaphore>
 </semaphores>
</cp-subsystem>

See https://vertx.io/docs/vertx-hazelcast/java/#_using_an_existing_hazelcast_cluster
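
To confirm that a given cluster.xml actually contains the structures named above, a small check along the following lines may help. It is a minimal sketch using only the JDK's XML parser: the class name `VertxHazelcastConfigCheck` is hypothetical, it looks only for the `__vertx.subs` multimap and the `__vertx.haInfo` and `__vertx.nodeInfo` maps, and it does not inspect the `cp-subsystem` section or validate the file against the Hazelcast schema.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Vert.x or Hazelcast.
final class VertxHazelcastConfigCheck {

    // {element tag, name attribute}, taken from the snippet in this comment.
    private static final String[][] REQUIRED = {
        {"multimap", "__vertx.subs"},
        {"map", "__vertx.haInfo"},
        {"map", "__vertx.nodeInfo"},
    };

    // Returns "tag:name" for every required structure not declared in the XML.
    static List<String> missingStructures(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Hazelcast XML", e);
        }
        List<String> missing = new ArrayList<>();
        for (String[] required : REQUIRED) {
            if (!hasNamedElement(doc, required[0], required[1])) {
                missing.add(required[0] + ":" + required[1]);
            }
        }
        return missing;
    }

    // True if some <tag name="..."> element with the given name exists anywhere in the document.
    private static boolean hasNamedElement(Document doc, String tag, String name) {
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            if (name.equals(element.getAttribute("name"))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the Hazelcast XML file, e.g. cluster.xml
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("Missing structures: " + missingStructures(xml));
    }
}
```

Run against the cluster.xml posted in the issue, this check should report all three structures as missing, since that file declares no `map` or `multimap` elements.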