adnklauser opened this issue 6 years ago
Hi @adnklauser
Yes - you're right. I've just checked the latest version of the charts we have locally, and we do indeed have the flags you mention set. The other main difference between our charts and the ones here is in the shared configmap, where it sets:
<address>jms</address>
This distributes only messages whose addresses start with 'jms', which for us didn't include everything. The default in these charts should probably be blank as the generic case (to include everything).
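For reference, a minimal sketch of what that part of the shared broker configuration looks like; the connector and cluster names here are placeholders, not the chart's exact values:

```xml
<!-- Illustrative cluster-connection from broker.xml; names are placeholders. -->
<cluster-connections>
  <cluster-connection name="artemis-cluster">
    <!-- Only addresses matching this prefix are load-balanced across the cluster.
         'jms' restricts distribution to addresses starting with 'jms';
         leaving it empty matches every address. -->
    <address></address>
    <connector-ref>netty-connector</connector-ref>
    <message-load-balancing>ON_DEMAND</message-load-balancing>
    <discovery-group-ref discovery-group-name="activemq-discovery"/>
  </cluster-connection>
</cluster-connections>
```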
I'll work on a PR for these charts. Hope this helps!
We faced the same issue when we tried to run the above chart without any persistence for the live (master) node.
Even with:
Are there any updates regarding this issue? I want to use this Helm chart in a k8s production environment, but the aforementioned issue still exists, and as a workaround I am deleting the slave pod whenever the master restarts. I also tried adding <check-for-live-server>true</check-for-live-server> and <allow-failback>true</allow-failback>
to the respective master and slave configmap files, but it still doesn't work. Can we expect an upgraded Helm chart with proper failover and failback?
@chandras-xl The issue is not with the chart but with the Artemis cluster configuration itself. So if you don't have any kind of persistent storage inside your k8s cluster, move your Artemis cluster to a virtual machine, i.e. run it as a Docker image (docker-compose) or as a daemon.
@andrusstrockiy Thank you! The failover and failback worked after using persistent storage on my k8s cluster.
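For context, the state that replication and failback rely on lives in the broker's data directories, so those paths need to survive a pod restart. A rough sketch of the relevant broker.xml section, with placeholder paths that should point at a persistent volume mount:

```xml
<!-- Journal, bindings, paging and large-message directories hold the broker state
     used during replication and failback; /var/lib/artemis/data is a placeholder
     for a path backed by a PersistentVolume. -->
<paging-directory>/var/lib/artemis/data/paging</paging-directory>
<bindings-directory>/var/lib/artemis/data/bindings</bindings-directory>
<journal-directory>/var/lib/artemis/data/journal</journal-directory>
<large-messages-directory>/var/lib/artemis/data/large-messages</large-messages-directory>
```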
Scenario
Restart one master node (kubectl delete pod xxx) to simulate a service interruption.

Expected behaviour
Slave becomes active immediately, and when the master is back up (restarted by k8s) and synchronized, there is still only one active ActiveMQ Artemis instance for that master/slave pair.
Actual behaviour
Slave becomes active immediately (✔️), but after k8s restarts the master pod, it, too, is considered active (❌), at least from the perspective of k8s (1/1 pods). The consequence of this is that k8s would route requests to both master and slave (via the service DNS).
Additional information
I haven't really tested much beyond this observation. I don't know if the master node would have actually responded to requests. But I find it a bit weird that the system doesn't return to the original state after a failover.
The Artemis HA documentation suggests using <allow-failback>true</allow-failback> on the slave and <check-for-live-server>true</check-for-live-server> on the master. I must confess I don't understand why the chart explicitly configures the opposite, but my experience with Artemis is very limited so far.
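For anyone landing here, a minimal sketch of the ha-policy blocks the documentation describes, assuming shared-nothing replication as used by this chart; this is not the chart's exact configmap content:

```xml
<!-- Master broker.xml (sketch): on restart, check whether another server has
     taken over as live before activating, so the pair fails back instead of
     ending up with two live brokers. -->
<ha-policy>
  <replication>
    <master>
      <check-for-live-server>true</check-for-live-server>
    </master>
  </replication>
</ha-policy>

<!-- Slave broker.xml (sketch): when the original master comes back and has
     synchronized, resign and return to being a backup so the master becomes
     live again. -->
<ha-policy>
  <replication>
    <slave>
      <allow-failback>true</allow-failback>
    </slave>
  </replication>
</ha-policy>
```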