Open KURANADO2 opened 10 months ago
Honestly, I think running maxwell on kubernetes plus raft is overkill: k8s already has a great mechanism for running one and only one copy of a service like maxwell at a time, and it restarts the service when it goes down, when a node dies, etc.
If you decide you really need to do that, I think you have to configure jgroups-raft to talk over TCP instead of its default multicast transport, but that's outside my expertise; I'd check with the jgroups-raft folks.
I am having the same issue:
2024-03-21 13:00:53 INFO Maxwell - Starting Maxwell. maxMemory: 8068464640 bufferMemoryUsage: 0.25
2024-03-21 13:00:53 DEBUG Configurator - set property TCP.bind_addr to default value /10.244.0.88
2024-03-21 13:00:53 DEBUG Configurator - set property TCP.diagnostics_addr to default value /224.0.75.75
2024-03-21 13:00:53 DEBUG TCP - thread pool min/max/keep-alive: 0/100/30000, internal pool: 0/4/30000 (1 cores available)
2024-03-21 13:00:53 DEBUG NAKACK2 - JGRP000037: use_mcast_xmit should not be used because the transport (TCP) does not support IP multicasting; setting use_mcast_xmit to false
2024-03-21 13:00:53 INFO JChannel - local_addr: maxwell-0, name: maxwell-0-61765
2024-03-21 13:00:53 DEBUG LevelDBLog - Initializing log with empty Metadata
-------------------------------------------------------------------
GMS: address=maxwell-0, cluster=maxwell-0, physical address=10.244.0.88:7500
-------------------------------------------------------------------
2024-03-21 13:00:55 INFO GMS - maxwell-0: no members discovered after 2002 ms: creating cluster as coordinator
2024-03-21 13:00:55 DEBUG NAKACK2 -
[maxwell-0 setDigest()]
existing digest: ]
new digest: maxwell-0: [0 (0)]
resulting digest: maxwell-0: [0 (0)]
2024-03-21 13:00:55 DEBUG GMS - maxwell-0: installing view [maxwell-0|0] (1) [maxwell-0] (maxwell-0 joined)
2024-03-21 13:00:55 DEBUG STABLE - resuming message garbage collection
2024-03-21 13:00:55 DEBUG GMS - maxwell-0: created cluster (first member). My view is [maxwell-0|0], impl is CoordGmsImpl
2024-03-21 13:00:55 INFO MaxwellHA - enter HA group, current leader: null
2024-03-21 13:00:57 INFO MaxwellHA - lost HA election, current leader: null
2024-03-21 13:01:35 WARN TCP - JGRP000012: discarded message from different cluster maxwell-1 (our cluster is maxwell-0). Sender was maxwell-1
2024-03-21 13:01:36 WARN TCP - JGRP000012: discarded message from different cluster maxwell-2 (our cluster is maxwell-0). Sender was maxwell-2
2024-03-21 13:02:38 WARN TCP - JGRP000012: discarded message from different cluster maxwell-1 (our cluster is maxwell-0). Sender was maxwell-1 (received 7 identical messages from maxwell-1 in the last 63314 ms)
Each pod is creating its own cluster.
It's running in k8s with this jgroups config:
<?xml version='1.0' encoding='utf-8'?>
<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
  <TCP bind_port="7500" />
  <TCPPING async_discovery="true"
           initial_hosts="${jgroups.tcpping.initial_hosts:maxwell-headless[7500]}"
           port_range="3"/>
  <PING />
  <MERGE3 />
  <FD_SOCK/>
  <FD_ALL/>
  <VERIFY_SUSPECT timeout="1500"/>
  <pbcast.NAKACK2 xmit_interval="500"/>
  <UNICAST3 xmit_interval="500"/>
  <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
  <raft.NO_DUPES/>
  <pbcast.GMS print_local_addr="true" join_timeout="2000"/>
  <UFC max_credits="2M" min_threshold="0.4"/>
  <MFC max_credits="2M" min_threshold="0.4"/>
  <FRAG2 frag_size="60K"/>
  <raft.ELECTION election_min_interval="500" election_max_interval="1000" heartbeat_interval="250"/>
  <raft.RAFT members="maxwell-0,maxwell-1,maxwell-2" raft_id="${raft_id:undefined}"/>
  <raft.REDIRECT/>
</config>
I also ran into an issue with TCPPING, possibly related to:
https://issues.redhat.com/browse/AS7-4828
https://bugzilla.redhat.com/show_bug.cgi?id=900707
2024-03-21 13:07:16 INFO Maxwell - Starting Maxwell. maxMemory: 8068464640 bufferMemoryUsage: 0.25
java.lang.Exception: Property assignment of initial_hosts in TCPPING with original property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] and converted to null could not be assigned
at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:818)
at org.jgroups.stack.Configurator.initializeAttrs(Configurator.java:212)
at org.jgroups.stack.Configurator.createProtocolsAndInitializeAttrs(Configurator.java:126)
at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:65)
at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:49)
at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:490)
at org.jgroups.JChannel.init(JChannel.java:922)
at org.jgroups.JChannel.<init>(JChannel.java:123)
at org.jgroups.JChannel.<init>(JChannel.java:105)
at com.zendesk.maxwell.MaxwellHA.startHA(MaxwellHA.java:57)
at com.zendesk.maxwell.Maxwell.main(Maxwell.java:335)
Caused by: java.lang.Exception: Conversion of initial_hosts in TCPPING with property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] failed
at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:85)
at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812)
... 10 more
Caused by: java.net.UnknownHostException: maxwell-1.maxwell: Name or service not known
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929)
at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529)
at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1519)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1378)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
at org.jgroups.util.Util.parseCommaDelimitedHosts(Util.java:3583)
at org.jgroups.conf.PropertyConverters$InitialHosts.convert(PropertyConverters.java:49)
at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:82)
... 11 more
2024-03-21 13:07:16 ERROR Maxwell - Maxwell saw an exception and is exiting...
java.lang.Exception: Property assignment of initial_hosts in TCPPING with original property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] and converted to null could not be assigned
at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:818) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.Configurator.initializeAttrs(Configurator.java:212) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.Configurator.createProtocolsAndInitializeAttrs(Configurator.java:126) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:65) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:49) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:490) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.JChannel.init(JChannel.java:922) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.JChannel.<init>(JChannel.java:123) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.JChannel.<init>(JChannel.java:105) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at com.zendesk.maxwell.MaxwellHA.startHA(MaxwellHA.java:57) ~[maxwell-1.41.0.jar:1.41.0]
at com.zendesk.maxwell.Maxwell.main(Maxwell.java:335) [maxwell-1.41.0.jar:1.41.0]
Caused by: java.lang.Exception: Conversion of initial_hosts in TCPPING with property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] failed
at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:85) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
... 10 more
Caused by: java.net.UnknownHostException: maxwell-1.maxwell: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) ~[?:?]
at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929) ~[?:?]
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529) ~[?:?]
at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1519) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1378) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.jgroups.util.Util.parseCommaDelimitedHosts(Util.java:3583) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.conf.PropertyConverters$InitialHosts.convert(PropertyConverters.java:49) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:82) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
... 10 more
2024-03-21 13:07:16 INFO TaskManager - Stopping 0 tasks
2024-03-21 13:07:16 INFO TaskManager - Stopped all tasks
2024-03-21 13:07:16 DEBUG MaxwellContext - Shutdown complete: true
Stream closed EOF for maxwell/maxwell-0 (maxwell)
This happens when I have RAFT configured with per-pod hosts and ports:
<?xml version='1.0' encoding='utf-8'?>
<config xmlns="urn:org:jgroups"
  ...
  <TCP bind_port="7500" />
  <TCPPING async_discovery="true"
           initial_hosts="${jgroups.tcpping.initial_hosts:maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500]}"
           port_range="3"/>
  ...
</config>
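A note on the `UnknownHostException: maxwell-1.maxwell` in the trace above: names of the form `<pod>.<service>` only resolve when a headless Service with that exact name selects the pods and the StatefulSet's `serviceName` points at it. The sketch below shows one way that could look. It is an assumption-heavy illustration, not the poster's actual manifest: the `app: maxwell` label, the Service name `maxwell`, and the port name are all hypothetical. The first config in this comment refers to `maxwell-headless`; if that is the governing Service, the per-pod names would be `maxwell-0.maxwell-headless` and so on rather than `maxwell-0.maxwell`. Since the trace shows TCPPING resolving `initial_hosts` at startup, the sketch also publishes DNS records for pods that are not yet ready and starts the pods in parallel; whether that fully avoids the startup race is not something this thread confirms.

```yaml
# Headless Service that gives each StatefulSet pod a DNS name like maxwell-0.maxwell
apiVersion: v1
kind: Service
metadata:
  name: maxwell                    # hypothetical; must match the suffix used in initial_hosts
spec:
  clusterIP: None                  # headless: DNS returns pod IPs directly
  publishNotReadyAddresses: true   # publish records before pods pass readiness
  selector:
    app: maxwell                   # assumed pod label
  ports:
    - name: jgroups                # hypothetical port name
      port: 7500                   # bind_port from the TCP element
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: maxwell                    # yields pods maxwell-0, maxwell-1, maxwell-2
spec:
  serviceName: maxwell             # must name the headless Service above
  podManagementPolicy: Parallel    # start all pods at once instead of one by one
  replicas: 3
  selector:
    matchLabels:
      app: maxwell
  template:
    metadata:
      labels:
        app: maxwell
    spec:
      containers:
        - name: maxwell
          image: zendesk/maxwell:v1.41.0
```

Running `nslookup maxwell-1.maxwell` from inside one of the pods is a quick way to check whether the name resolves before involving JGroups at all.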
k8s config:
spec:
  containers:
    - name: maxwell
      image: zendesk/maxwell:v1.41.0
      imagePullPolicy: IfNotPresent
      command:
        - bin/maxwell
      args:
        - "--env_config_prefix=MW_"
        - "--ha"
        - "--raft_member_id=$(POD_NAME)"
        - "--client_id=$(POD_NAME)"
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
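One thing that stands out when comparing this config with the first log: GMS reports `cluster=maxwell-0`, and the JGRP000012 warnings show the peers sitting in clusters `maxwell-1` and `maxwell-2`. So each pod's cluster name equals its pod name, which is also what `--client_id=$(POD_NAME)` is set to. Assuming maxwell takes the JGroups cluster name from `client_id` (an inference from these logs, not something confirmed against the maxwell source here), a variant that keeps `raft_member_id` unique per pod but shares one `client_id` might let the three pods join the same cluster. The value `maxwell` below is a hypothetical placeholder.

```yaml
# Pod spec fragment: unique raft member id per pod, one shared client_id
spec:
  containers:
    - name: maxwell
      image: zendesk/maxwell:v1.41.0
      command:
        - bin/maxwell
      args:
        - "--env_config_prefix=MW_"
        - "--ha"
        - "--raft_member_id=$(POD_NAME)"   # stays unique; matches members= in raft.RAFT
        - "--client_id=maxwell"            # hypothetical shared value, identical on every pod
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
```

If the assumption holds, the log line to look for after the change would be a GMS view containing more than one member instead of the "discarded message from different cluster" warnings.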
Guys,
you really don't need k8s + raft. Let k8s run "1 and exactly 1" copy of maxwell; if it dies, k8s will replace it.
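For anyone who wants to try the single-copy approach described here, a minimal sketch might look like the following. It assumes a plain Deployment without `--ha`; the resource name and the `app: maxwell` label are hypothetical, while the image, command, and `--env_config_prefix` argument are taken from the config posted earlier in this thread. The `Recreate` strategy is a choice made for this sketch so that a rollout stops the old pod before starting the new one.

```yaml
# Single-replica Deployment: k8s restarts or reschedules the pod when it dies
apiVersion: apps/v1
kind: Deployment
metadata:
  name: maxwell              # hypothetical name
spec:
  replicas: 1                # one copy, no raft election needed
  strategy:
    type: Recreate           # on rollout, terminate the old pod before creating the new one
  selector:
    matchLabels:
      app: maxwell           # assumed label
  template:
    metadata:
      labels:
        app: maxwell
    spec:
      containers:
        - name: maxwell
          image: zendesk/maxwell:v1.41.0
          command:
            - bin/maxwell
          args:
            - "--env_config_prefix=MW_"
```

One caveat with this sketch: when a node becomes unreachable, a Deployment can schedule a replacement while the old pod may still be running on the lost node, so "exactly one" is best-effort here. A StatefulSet with `replicas: 1` is stricter about never running two pods with the same identity, at the cost of slower recovery.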
I want to run maxwell on kubernetes.
Below is my raft.xml file:
Below is my kubernetes deploy yaml file, maxwell.yaml:
Content of the maxwell.sh script referenced in the yaml file:
When I execute the command kubectl apply -f maxwell.yaml, the logs for the three Pods are as follows:
What should I do to get the three nodes to start an election?