quarkiverse / quarkus-artemis

Quarkus Artemis extensions
Apache License 2.0

unable to connect to cluster if first node is down #567

Closed vsevel closed 2 months ago

vsevel commented 2 months ago

problem description

I have created a Quarkus app that connects to a HornetQ cluster and sends a message every 5 seconds. If both HornetQ nodes are up when I start the application, the messages are sent OK. If I shut down node x1 and then start the application, no messages get sent at all, and the following exception is logged every time the scheduler fires:

2024-07-23 15:50:37,869 INFO  [io.qua.iro.runtime] (jca-worker-pool-<default>-1) QIJ000001: Starting Resource Adapter <default>: ActiveMQ Artemis 2.33.0

2024-07-23 15:50:37,884 INFO  [org.apa.act.art.ra.ActiveMQRALogger] (jca-worker-pool-<default>-1) AMQ151007: Resource adaptor started
2024-07-23 15:50:38,041 INFO  [io.quarkus] (Quarkus Main Thread) jmsraavail 1.0.0-SNAPSHOT on JVM (powered by Quarkus 3.12.3) started in 3.446s. Listening on: http://localhost:18080                                              
2024-07-23 15:50:38,042 INFO  [io.quarkus] (Quarkus Main Thread) Profile dev activated. Live Coding activated.
2024-07-23 15:50:38,043 INFO  [io.quarkus] (Quarkus Main Thread) Installed features: [artemis-core, artemis-jms-ra, cdi, ironjacamar, narayana-jta, quartz, rest, scheduler, smallrye-context-propagation, vertx]    
...              
2024-07-23 15:50:40,145 WARN  [org.jbo.jca.cor.con.poo.str.PoolByCri] (vert.x-worker-thread-1) IJ000604: Throwable while attempting to get a new connection: null: jakarta.resource.ResourceException: Failed to create session factory
        at org.apache.activemq.artemis.ra.ActiveMQRAManagedConnection.setup(ActiveMQRAManagedConnection.java:725)
        at org.apache.activemq.artemis.ra.ActiveMQRAManagedConnection.<init>(ActiveMQRAManagedConnection.java:161)
        at org.apache.activemq.artemis.ra.ActiveMQRAManagedConnectionFactory.createManagedConnection(ActiveMQRAManagedConnectionFactory.java:144)
        at org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreArrayListManagedConnectionPool.createConnectionEventListener(SemaphoreArrayListManagedConnectionPool.java:1267)
        at org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreArrayListManagedConnectionPool.getConnection(SemaphoreArrayListManagedConnectionPool.java:495)
        at org.jboss.jca.core.connectionmanager.pool.AbstractPool.getTransactionNewConnection(AbstractPool.java:770)
        at org.jboss.jca.core.connectionmanager.pool.AbstractPool.getConnection(AbstractPool.java:666)
        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.getManagedConnection(AbstractConnectionManager.java:624)
        at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.getManagedConnection(TxConnectionManagerImpl.java:440)
        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:789)
        at org.apache.activemq.artemis.ra.ActiveMQRASessionFactoryImpl.allocateConnection(ActiveMQRASessionFactoryImpl.java:792)
        at org.apache.activemq.artemis.ra.ActiveMQRASessionFactoryImpl.createSession(ActiveMQRASessionFactoryImpl.java:482)
        at org.apache.activemq.artemis.ra.ActiveMQRASessionFactoryImpl.createSession(ActiveMQRASessionFactoryImpl.java:673)
        at org.apache.activemq.artemis.ra.ActiveMQRASessionFactoryImpl.createSession(ActiveMQRASessionFactoryImpl.java:678)
        at org.apache.activemq.artemis.ra.ActiveMQRAConnectionFactoryImpl.validateUser(ActiveMQRAConnectionFactoryImpl.java:422)
        at org.apache.activemq.artemis.ra.ActiveMQRAConnectionFactoryImpl.createContext(ActiveMQRAConnectionFactoryImpl.java:379)
        at org.apache.activemq.artemis.ra.ActiveMQRAConnectionFactoryImpl.createContext(ActiveMQRAConnectionFactoryImpl.java:369)
        at org.apache.activemq.artemis.ra.ActiveMQRAConnectionFactoryImpl.createContext(ActiveMQRAConnectionFactoryImpl.java:364)
        at org.acme.MyProducer.send(MyProducer.java:40)
        at org.acme.MyProducer_Subclass.send$$superforward(Unknown Source)
        at org.acme.MyProducer_Subclass$$function$$1.apply(Unknown Source)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:136)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:107)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.doIntercept(TransactionalInterceptorRequired.java:38)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.intercept(TransactionalInterceptorBase.java:61)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.intercept(TransactionalInterceptorRequired.java:32)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
        at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
        at org.acme.MyProducer_Subclass.send(Unknown Source)
        at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createConnectionInternal(ActiveMQConnectionFactory.java:915)
        at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createXAConnection(ActiveMQConnectionFactory.java:387)
        at org.apache.activemq.artemis.ra.ActiveMQRAManagedConnection.setup(ActiveMQRAManagedConnection.java:713)
        ... 49 more
Caused by: ActiveMQNotConnectedException[errorType=NOT_CONNECTED message=AMQ219007: Cannot connect to server(s). Tried with all available servers.]
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:735)
        at org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory.createConnectionInternal(ActiveMQConnectionFactory.java:913)
        ... 51 more

The application has the following configuration:

quarkus.http.port=18080
quarkus.ironjacamar.ra.kind=artemis
quarkus.ironjacamar.ra.config.connection-parameters=host=x1;port=...,host=x2;port=...;protocols=CORE
quarkus.ironjacamar.ra.config.protocol-manager-factory=org.apache.activemq.artemis.core.protocol.hornetq.client.HornetQClientProtocolManagerFactory
quarkus.ironjacamar.ra.config.user=...
quarkus.ironjacamar.ra.config.password=...
quarkus.ironjacamar.activation-spec.myqueue.config.destination-type=jakarta.jms.Queue
quarkus.ironjacamar.activation-spec.myqueue.config.destination=jms.queue....
quarkus.ironjacamar.activation-spec.myqueue.config.max-session=10
quarkus.ironjacamar.activation-spec.myqueue.config.rebalance-connections=true

if I exchange x1 and x2 then it works:

quarkus.ironjacamar.ra.config.connection-parameters=host=x2;port=...,host=x1;port=...;protocols=CORE

this only fails when the node that is down is listed first in connection-parameters.

expectation

the client should be able to connect to any running node.

producer code

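import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Date;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.jms.ConnectionFactory;
import jakarta.jms.JMSContext;
import jakarta.jms.JMSProducer;
import jakarta.jms.Queue;
import jakarta.jms.TextMessage;

import org.apache.activemq.artemis.jms.client.ActiveMQDestination;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import io.quarkus.scheduler.Scheduled;

@ApplicationScoped
public class MyProducer {

    private static final Logger log = LoggerFactory.getLogger(MyProducer.class);

    public static final String QUEUE = "jms.queue.com.lodh.jee.ArteTest.Bank.jms.bankref1.QueueB";

    private final Queue queue = ActiveMQDestination.createQueue(QUEUE);

    @Inject
    ConnectionFactory factory;

    // Sends one text message every 5 seconds; the context is closed after each send.
    @Scheduled(every = "5s")
    public void send() throws UnknownHostException {
        try (JMSContext context = factory.createContext()) {
            String text = new Date() + " from host=" + InetAddress.getLocalHost().getHostName();
            log.info("sending jms message {}", text);
            JMSProducer producer = context.createProducer();
            TextMessage message = context.createTextMessage(text);
            producer.send(queue, message);
        }
    }
}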

pom

<quarkus.platform.version>3.12.3</quarkus.platform.version>
...
    <dependency>
      <groupId>io.quarkiverse.artemis</groupId>
      <artifactId>quarkus-artemis-jms-ra</artifactId>
      <version>3.3.0</version>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-scheduler</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.activemq</groupId>
      <artifactId>artemis-hqclient-protocol</artifactId>
      <version>2.33.0</version>
    </dependency>

cc @gastaldi @zhfeng

gastaldi commented 2 months ago

Is this a known issue/feature in the artemis-ra @clebertsuconic?

vsevel commented 2 months ago

Never seen that in our eap clusters

clebertsuconic commented 2 months ago

Is this actually HornetQ? Clustering is not actually related to the RA.

turing85 commented 2 months ago

> Is this actually HornetQ? Clustering is not actually related to the RA.

I am having trouble categorizing the issue and your statement. Does this mean this isn't on "our" (quarkus-artemis) end?

@zhfeng are you able to determine whether this is something on our end?

gastaldi commented 2 months ago

> Is this actually HornetQ? Clustering is not actually related to the RA.

@clebertsuconic apologies, I thought it was because of the configuration that is passed to the RA:

quarkus.ironjacamar.ra.config.connection-parameters=host=x1;port=...,host=x2;port=...;protocols=CORE

I haven't checked the sources, but looking at the stacktrace it seems to originate from the Artemis Client used internally while creating the SessionFactory

zhfeng commented 2 months ago

@turing85 I guess it could be related to quarkus-ironjacamar or artemis-ra.

@vsevel Is it possible to enable TRACE logging? It could help us check whether a connection was attempted to every host.

vsevel commented 2 months ago

when I look at the ActiveMQResourceAdapter instance, I can see unparsedProperties=host=x1;port=...,host=x2;port=...;protocols=CORE, but the defaultActiveMQConnectionFactory is only:

ActiveMQConnectionFactory [serverLocator=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=b5ca45d6-4a86-11ef-a31c-00155d61c210, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)?port=...&host=x1], discoveryGroupConfiguration=null], clientID=null, consumerWindowSize=1048576, dupsOKBatchSize=1048576, transactionBatchSize=1048576, readOnly=false, EnableSharedClientID=true]

only the first host is in the initial connectors.

vsevel commented 2 months ago

is quarkus.ironjacamar.ra.config.connection-parameters=host=x1;port=...,host=x2;port=...;protocols=CORE the correct syntax?
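
To make the syntax question concrete, here is a minimal sketch of how such a string can be read, assuming that a comma separates one host's parameter group from the next and a semicolon separates key=value pairs inside a group. ConnectionParameterGroups is a hypothetical helper written for this thread; it is not the parser used by the Artemis resource adapter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Artemis or quarkus-artemis.
final class ConnectionParameterGroups {

    // Splits "k=v;k=v,k=v;k=v" into one ordered map per comma-separated group.
    static List<Map<String, String>> parse(String connectionParameters) {
        List<Map<String, String>> groups = new ArrayList<>();
        for (String group : connectionParameters.split(",")) {
            Map<String, String> params = new LinkedHashMap<>();
            for (String pair : group.split(";")) {
                String[] kv = pair.split("=", 2);
                if (kv.length != 2) {
                    throw new IllegalArgumentException("Expected key=value but got: " + pair);
                }
                params.put(kv[0].trim(), kv[1].trim());
            }
            groups.add(params);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Same string as in the configuration above; the elided ports stay elided.
        String raw = "host=x1;port=...,host=x2;port=...;protocols=CORE";
        parse(raw).forEach(System.out::println);
    }
}
```

Under this reading the string describes two groups, one per host, and protocols=CORE belongs only to the second group because it appears after the comma.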

vsevel commented 2 months ago

in ArtemisResourceAdapterFactory we have:

    public ActiveMQResourceAdapter createResourceAdapter(String id, Map<String, String> config) {
        ActiveMQResourceAdapter adapter = new ActiveMQResourceAdapter();
        adapter.setConnectorClassName(NettyConnectorFactory.class.getName());
        adapter.setConnectionParameters((String)config.get("connection-parameters"));

but later in ActiveMQResourceAdapter we build one transport configuration per connector class name:

      } else if (connectorClassName != null) {
         TransportConfiguration[] transportConfigurations = new TransportConfiguration[connectorClassName.size()];
...
         for (int i = 0; i < connectorClassName.size(); i++) {
            TransportConfiguration tc;
            if (connectionParams == null || i >= connectionParams.size()) {
               tc = new TransportConfiguration(connectorClassName.get(i));
               logger.debug("No connector params provided using default");
            } else {
               tc = new TransportConfiguration(connectorClassName.get(i), connectionParams.get(i));
            }

            transportConfigurations[i] = tc;
         }
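
The effect of the loop above can be reproduced with a simplified, self-contained model. ConnectorPairingModel and its string-based "transport configurations" are assumptions made for illustration; only the loop bounds and the null/index check mirror the quoted code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the quoted loop; hypothetical, not Artemis code.
final class ConnectorPairingModel {

    // Produces one entry per connector class name. Parameter groups whose
    // index is >= the number of class names are never read.
    static List<String> pair(List<String> connectorClassNames, List<String> paramGroups) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < connectorClassNames.size(); i++) {
            if (paramGroups == null || i >= paramGroups.size()) {
                result.add(connectorClassNames.get(i) + "?<defaults>");
            } else {
                result.add(connectorClassNames.get(i) + "?" + paramGroups.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> params = List.of("host=x1", "host=x2");
        // One class name, two parameter groups: the second host is dropped.
        System.out.println(pair(List.of("Netty"), params));
        // Two class names, two parameter groups: both hosts are kept.
        System.out.println(pair(List.of("Netty", "Netty"), params));
    }
}
```

With a single connector class name the model yields a single connector for x1, which matches the initialConnectors observed earlier in the thread.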

instead of adapter.setConnectorClassName(NettyConnectorFactory.class.getName());, should we be passing org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory,org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory, one class name per host, so that both hosts are recognized?

gastaldi commented 2 months ago

@vsevel that looks weird, but if that's how the adapter works, maybe it's acceptable. Can you test with a custom ResourceAdapterFactory and let us know how that works?

vsevel commented 2 months ago

I confirm that if I intercept the code in:

   public void setConnectorClassName(final String connectorClassName) {
      logger.trace("setTransportType({})", connectorClassName);

and change the connectorClassName to "org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory,org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory", then it works as expected.

you either need to pass as many NettyConnectorFactory entries as there are connection parameter groups, or make the connector class name configurable.
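
The first option could look roughly like the sketch below, which repeats the connector class name once per comma-separated parameter group. ConnectorClassNames and forConnectionParameters are hypothetical names, and this is not necessarily how the eventual fix is implemented.

```java
import java.util.Collections;

// Hypothetical helper for illustration; not taken from quarkus-artemis.
final class ConnectorClassNames {

    static final String NETTY =
            "org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory";

    // Returns the class name repeated once per comma-separated parameter group,
    // joined with commas. A null or blank parameter string yields a single entry.
    static String forConnectionParameters(String connectorClassName, String connectionParameters) {
        if (connectionParameters == null || connectionParameters.isBlank()) {
            return connectorClassName;
        }
        int groups = connectionParameters.split(",").length;
        return String.join(",", Collections.nCopies(groups, connectorClassName));
    }

    public static void main(String[] args) {
        String params = "host=x1;port=...,host=x2;port=...;protocols=CORE";
        System.out.println(forConnectionParameters(NETTY, params));
    }
}
```

The resulting string is what would be handed to setConnectorClassName, so the number of class names always matches the number of hosts in connection-parameters.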

gastaldi commented 2 months ago

@vsevel since you found the root cause, do you want to submit a PR with the fix?

gastaldi commented 2 months ago

I went ahead and created #572, see if that works for you

vsevel commented 2 months ago

thanks @gastaldi I will have a look early next week

turing85 commented 2 months ago

@all-contributors please add @vsevel for bug

allcontributors[bot] commented 2 months ago

@turing85

I've put up a pull request to add @vsevel! :tada:

vsevel commented 2 months ago

thanks @gastaldi

turing85 commented 2 months ago

@vsevel is it okay if we hold back the release until quarkus 3.13.0 reaches GA (and camel-quarkus version 3.13.0 lands)?

gastaldi commented 2 months ago

> thanks @gastaldi

Thank you for doing the hardest part of investigating 😉

vsevel commented 2 months ago

> is it okay if we hold back the release until quarkus 3.13.0 reaches GA (and camel-quarkus version 3.13.0 lands)?

yes @turing85

turing85 commented 2 months ago

The fix has been released with versions 3.2.2 and 3.4.0.