apache / incubator-seata

:fire: Seata is an easy-to-use, high-performance, open source distributed transaction solution.
https://seata.apache.org/
Apache License 2.0

NullPointerException when deploying raft using StatefulSet in k8s (server-addr uses cluster DNS) #6242

Closed zxuanhong closed 8 months ago

zxuanhong commented 8 months ago

Ⅰ. Issue Description

  1. A NullPointerException is thrown when deploying raft using a StatefulSet in k8s. The addresses in server-addr are cluster DNS names, not IP addresses (the pod IP changes on every deployment, so pinning any one IP is meaningless).
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: seata-cluster
  namespace: businessbasic
  labels:
    app: seata-cluster
  annotations:
    kubesphere.io/creator: admin
spec:
  replicas: 2
  selector:
    matchLabels:
      app: seata-cluster
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: seata-cluster
      annotations:
        kubesphere.io/creator: admin
        kubesphere.io/imagepullsecrets: '{"container-s6kka2":"harbor"}'
        logging.kubesphere.io/logsidecar-config: '{}'
    spec:
      volumes:
        - name: host-time
          hostPath:
            path: /etc/localtime
            type: ''
        - name: volume-ui72s9
          configMap:
            name: seata-cluster
            items:
              - key: application.yml
                path: application.yml
            defaultMode: 420
      containers:
        - name: container-s6kka2
          image: 'harbor.anyilanxin.com/library/seataio/seata-server:2.0.0-slim'
          ports:
            - name: tcp-7091
              containerPort: 7091
              protocol: TCP
            - name: tcp-8091
              containerPort: 8091
              protocol: TCP
          env:
            - name: seata.server.raft.server-addr
              value: >-
                seata-cluster-0.seata-cluster-0834.businessbasic.svc.cluster.local:9091,seata-cluster-1.seata-cluster-0834.businessbasic.svc.cluster.local:9091,seata-cluster-2.seata-cluster-0834.businessbasic.svc.cluster.local:9091
          resources: {}
          volumeMounts:
            - name: host-time
              mountPath: /etc/localtime
            - name: seata-cluster
              mountPath: /app/seatadata
            - name: volume-ui72s9
              readOnly: true
              mountPath: /seata-server/resources/application.yml
              subPath: application.yml
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: default
      serviceAccount: default
      securityContext: {}
      imagePullSecrets:
        - name: harbor
      schedulerName: default-scheduler
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: seata-cluster
        namespace: businessbasic
        creationTimestamp: null
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: nfs-csi
        volumeMode: Filesystem
      status:
        phase: Pending
  serviceName: seata-cluster-0834
  podManagementPolicy: OrderedReady
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
  revisionHistoryLimit: 10

Ⅱ. Describe what happened

2024-01-05T18:13:38.666032373+08:00 apm-skywalking not enabled

2024-01-05T18:13:38.817254209+08:00 JMX disabled

2024-01-05T18:13:38.819590002+08:00 Affected JVM parameters: -Dlog.home=/root/logs/seata -server -Dloader.path=/lib -Xmx2048m -Xms2048m -Xss640k -XX:SurvivorRatio=10 -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=1024m -XX:-OmitStackTraceInFastThrow -XX:-UseAdaptiveSizePolicy -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/root/logs/seata/java_heapdump.hprof -XX:+DisableExplicitGC -Xloggc:/root/logs/seata/seata_gc.log -verbose:gc -XX:+PrintGCDetails  -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Dio.netty.leakDetectionLevel=advanced -Dapp.name=seata-server -Dapp.pid=1 -Dapp.home=/ -Dbasedir=/ 

2024-01-05T18:13:38.830342920+08:00 OpenJDK 64-Bit Server VM warning: Cannot open file /root/logs/seata/seata_gc.log due to No such file or directory

2024-01-05T18:13:38.830362404+08:00 

2024-01-05T18:13:40.876877858+08:00 ███████╗███████╗ █████╗ ████████╗ █████╗

2024-01-05T18:13:40.876911310+08:00 ██╔════╝██╔════╝██╔══██╗╚══██╔══╝██╔══██╗

2024-01-05T18:13:40.876917316+08:00 ███████╗█████╗  ███████║   ██║   ███████║

2024-01-05T18:13:40.876922903+08:00 ╚════██║██╔══╝  ██╔══██║   ██║   ██╔══██║

2024-01-05T18:13:40.876928001+08:00 ███████║███████╗██║  ██║   ██║   ██║  ██║

2024-01-05T18:13:40.876932401+08:00 ╚══════╝╚══════╝╚═╝  ╚═╝   ╚═╝   ╚═╝  ╚═╝

2024-01-05T18:13:40.876936172+08:00 

2024-01-05T18:13:40.876940572+08:00 

2024-01-05T18:13:41.142911760+08:00 18:13:41.137  INFO --- [                     main] [ta.config.ConfigurationFactory] [                load]  [] : load Configuration from :Spring Configuration

2024-01-05T18:13:41.157431377+08:00 18:13:41.157  INFO --- [                     main] [ta.config.ConfigurationFactory] [  buildConfiguration]  [] : load Configuration from :Spring Configuration

2024-01-05T18:13:41.517719069+08:00 18:13:41.517  INFO --- [                     main] [seata.server.ServerApplication] [         logStarting]  [] : Starting ServerApplication using Java 1.8.0_342 on seata-cluster-0 with PID 1 (/seata-server/classes started by root in /seata-server)

2024-01-05T18:13:41.518560963+08:00 18:13:41.518  INFO --- [                     main] [seata.server.ServerApplication] [ogStartupProfileInfo]  [] : No active profile set, falling back to 1 default profile: "default"

2024-01-05T18:13:43.237626153+08:00 18:13:43.237  INFO --- [                     main] [mbedded.tomcat.TomcatWebServer] [          initialize]  [] : Tomcat initialized with port(s): 7091 (http)

2024-01-05T18:13:43.248701092+08:00 18:13:43.248  INFO --- [                     main] [oyote.http11.Http11NioProtocol] [                 log]  [] : Initializing ProtocolHandler ["http-nio-7091"]

2024-01-05T18:13:43.251317075+08:00 18:13:43.250  INFO --- [                     main] [.catalina.core.StandardService] [                 log]  [] : Starting service [Tomcat]

2024-01-05T18:13:43.251555780+08:00 18:13:43.251  INFO --- [                     main] [e.catalina.core.StandardEngine] [                 log]  [] : Starting Servlet engine: [Apache Tomcat/9.0.82]

2024-01-05T18:13:43.386202535+08:00 18:13:43.385  INFO --- [                     main] [rBase.[Tomcat].[localhost].[/]] [                 log]  [] : Initializing Spring embedded WebApplicationContext

2024-01-05T18:13:43.386251700+08:00 18:13:43.385  INFO --- [                     main] [letWebServerApplicationContext] [ebApplicationContext]  [] : Root WebApplicationContext: initialization completed in 1808 ms

2024-01-05T18:13:44.433781501+08:00 18:13:44.432  INFO --- [                     main] [vlet.WelcomePageHandlerMapping] [              <init>]  [] : Adding welcome page: class path resource [static/index.html]

2024-01-05T18:13:44.839314275+08:00 18:13:44.838  INFO --- [                     main] [oyote.http11.Http11NioProtocol] [                 log]  [] : Starting ProtocolHandler ["http-nio-7091"]

2024-01-05T18:13:44.862359404+08:00 18:13:44.861  INFO --- [                     main] [mbedded.tomcat.TomcatWebServer] [               start]  [] : Tomcat started on port(s): 7091 (http) with context path ''

2024-01-05T18:13:44.874035018+08:00 18:13:44.873  INFO --- [                     main] [seata.server.ServerApplication] [          logStarted]  [] : Started ServerApplication in 5.102 seconds (JVM running for 6.044)

2024-01-05T18:13:45.039037727+08:00 18:13:45.038  INFO --- [                     main] [rver.lock.LockerManagerFactory] [                init]  [] : use lock store mode: raft

2024-01-05T18:13:45.044730132+08:00 18:13:45.044  INFO --- [                     main] [a.server.session.SessionHolder] [                init]  [] : use session store mode: raft

2024-01-05T18:13:45.159261241+08:00 18:13:45.158  INFO --- [                     main] [.jraft.util.JRaftServiceLoader] [         newProvider]  [] : SPI service [com.alipay.sofa.jraft.rpc.RaftRpcFactory - com.alipay.sofa.jraft.rpc.impl.BoltRaftRpcFactory] loading.

2024-01-05T18:13:45.271101781+08:00 Sofa-Middleware-Log SLF4J : Actual binding is of type [ com.alipay.remoting Logback ]

2024-01-05T18:13:45.272041378+08:00 18:13:45.271  INFO --- [                     main] [com.alipay.sofa.common.log    ] [              report]  [] : Sofa-Middleware-Log SLF4J : Actual binding is of type [ com.alipay.remoting Logback ]

2024-01-05T18:13:45.453409306+08:00 18:13:45.453  WARN --- [                     main] [cluster.raft.RaftServerFactory] [                init]  [] : raft mode and raft cluster is an experimental feature

2024-01-05T18:13:45.460925104+08:00 18:13:45.460 ERROR --- [                     main] [io.seata.server.ServerRunner  ] [                 run]  [] : seata server start error: null 

2024-01-05T18:13:45.460950595+08:00 ==>

2024-01-05T18:13:45.460959325+08:00 java.lang.NullPointerException: null

2024-01-05T18:13:45.460968264+08:00     at io.seata.server.cluster.raft.RaftServerFactory.init(RaftServerFactory.java:132)

2024-01-05T18:13:45.460974689+08:00     at io.seata.server.session.SessionHolder.init(SessionHolder.java:102)

2024-01-05T18:13:45.460981254+08:00     at io.seata.server.session.SessionHolder.init(SessionHolder.java:84)

2024-01-05T18:13:45.460986980+08:00     at io.seata.server.Server.start(Server.java:92)

2024-01-05T18:13:45.460993196+08:00     at io.seata.server.ServerRunner.run(ServerRunner.java:60)

2024-01-05T18:13:45.461015055+08:00     at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:765)

2024-01-05T18:13:45.461022737+08:00     at org.springframework.boot.SpringApplication.lambda$callRunners$2(SpringApplication.java:749)

2024-01-05T18:13:45.461028045+08:00     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)

2024-01-05T18:13:45.461033772+08:00     at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)

2024-01-05T18:13:45.461038870+08:00     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483)

2024-01-05T18:13:45.461044247+08:00     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)

2024-01-05T18:13:45.461049415+08:00     at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)

2024-01-05T18:13:45.461054514+08:00     at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)

2024-01-05T18:13:45.461070087+08:00     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)

2024-01-05T18:13:45.461075954+08:00     at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)

2024-01-05T18:13:45.461081680+08:00     at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:744)

2024-01-05T18:13:45.461087547+08:00     at org.springframework.boot.SpringApplication.run(SpringApplication.java:315)

2024-01-05T18:13:45.461093413+08:00     at org.springframework.boot.SpringApplication.run(SpringApplication.java:1300)

2024-01-05T18:13:45.461099210+08:00     at org.springframework.boot.SpringApplication.run(SpringApplication.java:1289)

2024-01-05T18:13:45.461105495+08:00     at io.seata.server.ServerApplication.main(ServerApplication.java:30)

2024-01-05T18:13:45.461128891+08:00 <==

2024-01-05T18:13:45.461134897+08:00 

Ⅲ. Describe what you expected to happen

Ⅳ. How to reproduce it (as minimally and precisely as possible)

  1. xxx
  2. xxx
  3. xxx

Minimal yet complete reproducer code (or URL to code):

Ⅴ. Anything else we need to know?

Ⅵ. Environment:

funky-eyes commented 8 months ago

Please set SEATA_IP on each server node to that node's own address. For example, if node one is seata-cluster-0.seata-cluster-0834.businessbasic.svc.cluster.local:9091, then set SEATA_IP=seata-cluster-0.seata-cluster-0834.businessbasic.svc.cluster.local for that node.
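One way the advice above might be applied to the StatefulSet from the issue is sketched below. It assumes that seata-cluster-0834 is a headless Service, so each pod resolves as `<pod-name>.seata-cluster-0834.businessbasic.svc.cluster.local`. The variable name POD_NAME is hypothetical and only serves as an intermediate value. Only SEATA_IP comes from the thread. The fragment relies on two standard Kubernetes features: the downward API (`fieldRef`) and `$(VAR)` expansion of previously declared environment variables.

```yaml
# Fragment of spec.template.spec.containers[0] in the StatefulSet above.
env:
  # Hypothetical helper variable: the pod's own name, e.g. seata-cluster-0.
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  # Each replica advertises its own stable DNS name instead of a pod IP.
  # POD_NAME must be declared before SEATA_IP for $(POD_NAME) to expand.
  - name: SEATA_IP
    value: $(POD_NAME).seata-cluster-0834.businessbasic.svc.cluster.local
  # Unchanged from the original manifest (value elided here).
  # - name: seata.server.raft.server-addr
  #   value: ...
```

With this in place, every replica derives SEATA_IP from its own pod name, so the same pod template works for all nodes without hard-coding a per-node value.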

zxuanhong commented 8 months ago

@funky-eyes Thank you very much. I'll try it later, and if it's OK, I'll close the current issue directly

zxuanhong commented 8 months ago

@funky-eyes Thank you very much, that works. It would be perfect if the administration console had a page like the one in Nacos to display cluster information.

funky-eyes commented 8 months ago

Good suggestion, we are planning to display the cluster information of seata-raft on the console

zacharias1989 commented 4 months ago

Hello, I use a similar configuration with 2.0.0-slim and raft, and the raft cluster runs fine. But when a Seata client connects to the cluster, it reports that TM and RM registered successfully, and then the error "Decode frame error, cause: Adjusted frame length exceeds 8388608: 1411395437 - discarded" is reported continuously. Why?

zacharias1989 commented 4 months ago


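With a raft cluster built from the 2.0.0-slim image, after TM and RM register successfully the server keeps reporting "Decode frame error, cause: Adjusted frame length exceeds 8388608: 1411395437 - discarded", and the client reports the corresponding error "read timed out". How can this be resolved?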