zendesk / maxwell

Maxwell's daemon, a mysql-to-json kafka producer
https://maxwells-daemon.io/

Running maxwell on kubernetes #2036

Open KURANADO2 opened 10 months ago

KURANADO2 commented 10 months ago

I want to run Maxwell on Kubernetes.

Below is my raft.xml file:

<?xml version='1.0' encoding='utf-8'?>
<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <UDP mcast_addr="228.8.8.8" mcast_port="${jgroups.udp.mcast_port:45588}"/>
    <PING />
    <MERGE3 />
    <FD_SOCK/>
    <FD_ALL/>
    <VERIFY_SUSPECT timeout="1500"/>
    <pbcast.NAKACK2 xmit_interval="500"/>
    <UNICAST3 xmit_interval="500"/>
    <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
    <raft.NO_DUPES/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"/>
    <UFC max_credits="2M" min_threshold="0.4"/>
    <MFC max_credits="2M" min_threshold="0.4"/>
    <FRAG2 frag_size="60K"/>
    <raft.ELECTION election_min_interval="500" election_max_interval="1000" heartbeat_interval="250"/>
    <raft.RAFT members="A,B,C" raft_id="${raft_id:undefined}"/>
    <raft.REDIRECT/>
</config>

Below is my Kubernetes deployment YAML file, maxwell.yaml:

---
apiVersion: v1
kind: Service
metadata:
  name: maxwell
  namespace: abup
#  annotations:
#    nginx.ingress.kubernetes.io/affinity: "true"
#    nginx.ingress.kubernetes.io/session-cookie-name: backend
#    nginx.ingress.kubernetes.io/load-balancer-method: drr
spec:
  type: NodePort
  sessionAffinity: ClientIP
  selector:
    app: maxwell
  ports:
  - name: web
    port: 7800
    targetPort: 7800
    nodePort: 30111
---
apiVersion: v1
kind: Service
metadata:
  name: maxwell-headless
  namespace: abup
  labels:
    app: maxwell

spec:
  ports:
    - port: 7800
      name: server
      targetPort: 7800
  clusterIP: None
  selector:
    app: maxwell

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
#  labels:
#    k8s-app: maxwell
#    qcloud-app: maxwell
  name: maxwell
  namespace: abup
spec:
  serviceName: maxwell-headless
  replicas: 3
  selector:
    matchLabels:
      k8s-app: maxwell
      qcloud-app: maxwell
#  strategy:
#    rollingUpdate:
#      maxSurge: 1
#      maxUnavailable: 0
#    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: maxwell
        qcloud-app: maxwell
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      affinity:
       nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution: 
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - 10.19.10.13
              - 10.19.10.2
              - 10.19.10.7

      containers:
      - env:
        - name: JAVA_OPTS
          value: -Djava.net.preferIPv4Stack=true
        - name: jgroup.tcpping.initial_hosts
          value: 10.19.10.13[7800],10.19.10.2[7800],10.19.10.7[7800]
        image: abup-registry-test.tencentcloudcr.com/abup-qa-gq-ota/maxwell:v1.37.521
        imagePullPolicy: Always
        name: maxwell
        command: ["/bin/bash","-c"]
        args: ["/app/maxwell.sh"]
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 1
            memory: 2Gi
        volumeMounts:
        - mountPath: /etc/maxwell/config.properties
          subPath: config.properties
          name: maxwell-conf
      #  volumeMounts:
      #  - mountPath: /app/raft.xml
      #    name: raft-xml
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: docker-login
      restartPolicy: Always
      volumes:
      - name: maxwell-conf
        configMap:
          name: maxwell-conf
          defaultMode: 0777
    #  volumes:
    #  - name: raft-xml
    #    hostPath:
    #      path: /root/raft.xml

Contents of the maxwell.sh script referenced in the YAML file:

#!/bin/bash
# Pick the raft member id from the StatefulSet pod hostname.
hostname=$(cat /etc/hostname)
if [ "$hostname" = maxwell-0 ]; then
  ./bin/maxwell --producer_partition_by=table --config /etc/maxwell/config.properties --ha --raft_member_id=A
elif [ "$hostname" = maxwell-1 ]; then
  ./bin/maxwell --producer_partition_by=table --config /etc/maxwell/config.properties --ha --raft_member_id=B
elif [ "$hostname" = maxwell-2 ]; then
  ./bin/maxwell --producer_partition_by=table --config /etc/maxwell/config.properties --ha --raft_member_id=C
else
  # Unknown hostname: leave a marker file instead of starting Maxwell.
  echo "No Matched" > test.txt
fi
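
As an aside, the three branches differ only in the member id, so the id could be derived from the pod's ordinal suffix instead. This is a minimal sketch, assuming the pods are named maxwell-0 to maxwell-2 and that the ids are A, B, C as in members="A,B,C" in raft.xml; the function name raft_member_id_for is made up for illustration.

```shell
#!/bin/bash
# Map a StatefulSet pod name such as "maxwell-1" to a raft member id.
# Assumption: ids A, B, C correspond to ordinals 0, 1, 2.
raft_member_id_for() {
  local host="$1"
  local ordinal="${host##*-}"   # text after the last '-'
  local ids=(A B C)
  # Reject names without a numeric suffix.
  if [[ ! "$ordinal" =~ ^[0-9]+$ ]]; then
    echo "no raft member id for host '$host'" >&2
    return 1
  fi
  # Reject ordinals beyond the configured members (10# forces base 10).
  if (( 10#$ordinal >= ${#ids[@]} )); then
    echo "no raft member id for host '$host'" >&2
    return 1
  fi
  echo "${ids[10#$ordinal]}"
}
```

The result could then be passed to the same command line as before, for example `--raft_member_id="$(raft_member_id_for "$(cat /etc/hostname)")"`. An unknown hostname makes the function fail with a message on stderr rather than silently writing a file.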

When I run kubectl apply -f maxwell.yaml, the logs for the three Pods are as follows:

[image: screenshot of the pod logs]

What should I do to get the three nodes to start an election?

osheroff commented 10 months ago

Honestly I think that running maxwell on kubernetes PLUS raft feels like overkill -- k8s already has a great mechanism for running one and only one copy of a service like maxwell at a time, plus it has support for restarting it when it's down, or when a node dies, etc.

If you decide you really, really need to do that, I think you have to configure jgroups-raft to talk via TCP instead of its normal multicast thing, but it's out of my expertise; I'd check over there.

yalattas commented 3 months ago

I am having the same issue:

2024-03-21 13:00:53 INFO  Maxwell - Starting Maxwell. maxMemory: 8068464640 bufferMemoryUsage: 0.25
2024-03-21 13:00:53 DEBUG Configurator - set property TCP.bind_addr to default value /10.244.0.88
2024-03-21 13:00:53 DEBUG Configurator - set property TCP.diagnostics_addr to default value /224.0.75.75
2024-03-21 13:00:53 DEBUG TCP - thread pool min/max/keep-alive: 0/100/30000, internal pool: 0/4/30000 (1 cores available)
2024-03-21 13:00:53 DEBUG NAKACK2 - JGRP000037: use_mcast_xmit should not be used because the transport (TCP) does not support IP multicasting; setting use_mcast_xmit to false
2024-03-21 13:00:53 INFO  JChannel - local_addr: maxwell-0, name: maxwell-0-61765
2024-03-21 13:00:53 DEBUG LevelDBLog - Initializing log with empty Metadata

-------------------------------------------------------------------
GMS: address=maxwell-0, cluster=maxwell-0, physical address=10.244.0.88:7500
-------------------------------------------------------------------
2024-03-21 13:00:55 INFO  GMS - maxwell-0: no members discovered after 2002 ms: creating cluster as coordinator
2024-03-21 13:00:55 DEBUG NAKACK2 - 
[maxwell-0 setDigest()]
existing digest:  ]
new digest:       maxwell-0: [0 (0)]
resulting digest: maxwell-0: [0 (0)]
2024-03-21 13:00:55 DEBUG GMS - maxwell-0: installing view [maxwell-0|0] (1) [maxwell-0] (maxwell-0 joined)
2024-03-21 13:00:55 DEBUG STABLE - resuming message garbage collection
2024-03-21 13:00:55 DEBUG GMS - maxwell-0: created cluster (first member). My view is [maxwell-0|0], impl is CoordGmsImpl
2024-03-21 13:00:55 INFO  MaxwellHA - enter HA group, current leader: null
2024-03-21 13:00:57 INFO  MaxwellHA - lost HA election, current leader: null
2024-03-21 13:01:35 WARN  TCP - JGRP000012: discarded message from different cluster maxwell-1 (our cluster is maxwell-0). Sender was maxwell-1
2024-03-21 13:01:36 WARN  TCP - JGRP000012: discarded message from different cluster maxwell-2 (our cluster is maxwell-0). Sender was maxwell-2
2024-03-21 13:02:38 WARN  TCP - JGRP000012: discarded message from different cluster maxwell-1 (our cluster is maxwell-0). Sender was maxwell-1 (received 7 identical messages from maxwell-1 in the last 63314 ms)

Each pod is creating its own cluster.
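
The log above shows this directly: the GMS banner reports cluster=maxwell-0, while the JGRP000012 warnings come from peers in clusters named maxwell-1 and maxwell-2. A small sketch for pulling those pairs out of a pod log; the function name cluster_mismatches is made up for illustration, and the pattern assumes the warning text has exactly the form shown above.

```shell
#!/bin/bash
# Print "<foreign cluster> <local cluster>" for every JGRP000012 warning
# read from stdin, one distinct pair per line.
cluster_mismatches() {
  sed -nE 's/.*JGRP000012: discarded message from different cluster ([^ ]+) \(our cluster is ([^)]+)\).*/\1 \2/p' \
    | sort -u
}
```

It can be fed from the pod log, for example `kubectl logs maxwell-0 | cluster_mismatches`. Any output means the pods joined differently named clusters and will not see each other. One possible cause, which is an unverified assumption here, is that the cluster name is taken from a per-pod setting such as the client id.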

It's working in k8s:

      <?xml version='1.0' encoding='utf-8'?>
      <config xmlns="urn:org:jgroups"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
          <TCP  bind_port="7500" />
          <TCPPING async_discovery="true"
            initial_hosts="${jgroups.tcpping.initial_hosts:maxwell-headless[7500]}"
            port_range="3"/>
          <PING />
          <MERGE3 />
          <FD_SOCK/>
          <FD_ALL/>
          <VERIFY_SUSPECT timeout="1500"/>
          <pbcast.NAKACK2 xmit_interval="500"/>
          <UNICAST3 xmit_interval="500"/>
          <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
          <raft.NO_DUPES/>
          <pbcast.GMS print_local_addr="true" join_timeout="2000"/>
          <UFC max_credits="2M" min_threshold="0.4"/>
          <MFC max_credits="2M" min_threshold="0.4"/>
          <FRAG2 frag_size="60K"/>
          <raft.ELECTION election_min_interval="500" election_max_interval="1000" heartbeat_interval="250"/>
          <raft.RAFT members="maxwell-0,maxwell-1,maxwell-2" raft_id="${raft_id:undefined}"/>
          <raft.REDIRECT/>
      </config>

I also faced an issue with TCPPING:

https://issues.redhat.com/browse/AS7-4828

https://bugzilla.redhat.com/show_bug.cgi?id=900707

2024-03-21 13:07:16 INFO  Maxwell - Starting Maxwell. maxMemory: 8068464640 bufferMemoryUsage: 0.25
java.lang.Exception: Property assignment of initial_hosts in TCPPING with original property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] and converted to null could not be assigned
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:818)
    at org.jgroups.stack.Configurator.initializeAttrs(Configurator.java:212)
    at org.jgroups.stack.Configurator.createProtocolsAndInitializeAttrs(Configurator.java:126)
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:65)
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:49)
    at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:490)
    at org.jgroups.JChannel.init(JChannel.java:922)
    at org.jgroups.JChannel.<init>(JChannel.java:123)
    at org.jgroups.JChannel.<init>(JChannel.java:105)
    at com.zendesk.maxwell.MaxwellHA.startHA(MaxwellHA.java:57)
    at com.zendesk.maxwell.Maxwell.main(Maxwell.java:335)
Caused by: java.lang.Exception: Conversion of initial_hosts in TCPPING with property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] failed
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:85)
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812)
    ... 10 more
Caused by: java.net.UnknownHostException: maxwell-1.maxwell: Name or service not known
    at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929)
    at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529)
    at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
    at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1519)
    at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1378)
    at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
    at org.jgroups.util.Util.parseCommaDelimitedHosts(Util.java:3583)
    at org.jgroups.conf.PropertyConverters$InitialHosts.convert(PropertyConverters.java:49)
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:82)
    ... 11 more
2024-03-21 13:07:16 ERROR Maxwell - Maxwell saw an exception and is exiting...
java.lang.Exception: Property assignment of initial_hosts in TCPPING with original property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] and converted to null could not be assigned
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:818) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.initializeAttrs(Configurator.java:212) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.createProtocolsAndInitializeAttrs(Configurator.java:126) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:65) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:49) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:490) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.JChannel.init(JChannel.java:922) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.JChannel.<init>(JChannel.java:123) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.JChannel.<init>(JChannel.java:105) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at com.zendesk.maxwell.MaxwellHA.startHA(MaxwellHA.java:57) ~[maxwell-1.41.0.jar:1.41.0]
    at com.zendesk.maxwell.Maxwell.main(Maxwell.java:335) [maxwell-1.41.0.jar:1.41.0]
Caused by: java.lang.Exception: Conversion of initial_hosts in TCPPING with property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] failed
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:85) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    ... 10 more
Caused by: java.net.UnknownHostException: maxwell-1.maxwell: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) ~[?:?]
    at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929) ~[?:?]
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529) ~[?:?]
    at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848) ~[?:?]
    at java.net.InetAddress.getAllByName0(InetAddress.java:1519) ~[?:?]
    at java.net.InetAddress.getAllByName(InetAddress.java:1378) ~[?:?]
    at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
    at org.jgroups.util.Util.parseCommaDelimitedHosts(Util.java:3583) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.conf.PropertyConverters$InitialHosts.convert(PropertyConverters.java:49) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:82) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    ... 10 more
2024-03-21 13:07:16 INFO  TaskManager - Stopping 0 tasks
2024-03-21 13:07:16 INFO  TaskManager - Stopped all tasks
2024-03-21 13:07:16 DEBUG MaxwellContext - Shutdown complete: true
Stream closed EOF for maxwell/maxwell-0 (maxwell)

This happens when I have RAFT configured with the port enabled:

<?xml version='1.0' encoding='utf-8'?>
    <config xmlns="urn:org:jgroups"
...
        <TCP  bind_port="7500" />
        <TCPPING async_discovery="true"
          initial_hosts="${jgroups.tcpping.initial_hosts:maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500]}"
          port_range="3"/>
...
    </config>
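
The initial_hosts value above follows the pattern pod-name.service-name[port]. Kubernetes publishes DNS names of that form only for StatefulSet pods governed by a headless Service whose name matches the StatefulSet's serviceName, so the string has to use that Service's actual name. The sketch below builds the list from its parts; the function name tcpping_initial_hosts and its argument order are made up for illustration, and which Service name applies in this setup is an assumption to check against the manifests.

```shell
#!/bin/bash
# Build a TCPPING initial_hosts value for the pods of a StatefulSet.
# Arguments: StatefulSet name, headless Service name, replica count, port.
tcpping_initial_hosts() {
  local sts="$1" svc="$2" replicas="$3" port="$4"
  local hosts=()
  local i
  for (( i = 0; i < replicas; i++ )); do
    hosts+=("${sts}-${i}.${svc}[${port}]")
  done
  # Join the entries with commas.
  local IFS=,
  echo "${hosts[*]}"
}
```

For example, `tcpping_initial_hosts maxwell maxwell 3 7500` reproduces the value used above. Note that the UnknownHostException in the log was raised for maxwell-1.maxwell, so that name did not resolve when maxwell-0 started. Whether that was caused by the Service name or by the peer pod not being up yet is not established in this thread.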

k8s config

spec:
  containers:
  - name: maxwell
    image: zendesk/maxwell:v1.41.0
    imagePullPolicy: IfNotPresent
    command:
    - bin/maxwell
    args:
    - "--env_config_prefix=MW_"
    - "--ha"
    - "--raft_member_id=$(POD_NAME)"
    - "--client_id=$(POD_NAME)"
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name

osheroff commented 3 months ago

guys,

you really don't need k8s + raft. There's really no need; let k8s run "1 and exactly 1" copy of maxwell; if one dies, k8s will replace it.