datavane / tis

Support agile DataOps Based on Flink, DataX and Flink-CDC, Chunjun with Web-UI
https://tis.pub
Apache License 2.0
1.04k stars 221 forks source link

sqlserver使用datax同步大量数据至doris时出现内存溢出报错,无法同步 #380

Closed zzxVv closed 1 week ago

zzxVv commented 1 month ago

使用datax同步2000万条数据至doris时,出现如下报错,配置均为系统默认配置。同步几十万条数据可正常同步。

`ERROR c.a.d.c.t.runner.ReaderRunner-Reader runner Received Exceptions:java.lang.OutOfMemoryError: Java heap space
taskid:11,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:11,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
taskid:10,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:10,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:10,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:11,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:11,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:11,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator-Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
taskid:11,tablename:t_wl_kc_phhw2024_0.json,read:0,all:-1
zzxVv commented 1 month ago

同时,再反馈个bug,使用sqlserver2008创建数据源时,出现报错java.lang.RuntimeException: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter。使用sqlserver2019可以创建。数据库为2014

baisui1981 commented 1 month ago

同时,再反馈个bug,使用sqlserver2008创建数据源时,出现报错java.lang.RuntimeException: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter。使用sqlserver2019可以创建。数据库为2014

收到,本地回归一下

baisui1981 commented 1 month ago

内存溢出的解决办法如下:

  1. 增大k8s启动tis-console 分配pod 的内存,并且,设置更大的DataX 批量同步配额,配置文件 ./tis-console.yaml:

      apiVersion: v1
      kind: Deployment
      metadata:
        name: tis-console
          spec:
            containers:
               # ......
                resources:
                  requests:
                    cpu: 1000m
                    memory: 1024Mi # 设置更大的内存配额
  2. TIS运行中设置批量执行配额:

  3. doris 设置提交是StreamLoad的最大单次传输内容Size 上限:

以上3个方法可以同时设置,看看能否避免OOM。

zzxVv commented 1 month ago

不行,三个都设置过了,还是会出现OOM。 ERROR c.a.d.c.t.runner.ReaderRunner-Reader runner Received Exceptions:java.lang.OutOfMemoryError: Java heap space 我的k8s配置为:

kind: ReplicationController
metadata:
  name: tis-console
spec:
  replicas: 1
  selector:
      app: tis-console
  template:
    metadata:
      labels:
        app: tis-console
    spec:
      containers:
        - name: tis-console
          volumeMounts:
            - name: tis-config-volume
              mountPath: /opt/app/tis-uber/web-start/conf/tis-web-config/
            - name: tis-console-pvc
              mountPath: "/opt/data"
          image: 172.16.19.106:5000/tis-4.0.1:latest
          ports:
            - name: tis-8080
              containerPort: 8080
            - name: tis-grpc
              containerPort: 56432
          resources:
            requests:
              cpu: "4"
              memory: "8Gi"
            limits:
              cpu: "8"
              memory: "16Gi"

是否可以直接调整datax的JVM内存参数?

baisui1981 commented 4 weeks ago

这个页面的参数你是设置的多少?可否发个截图

zzxVv commented 4 weeks ago

00

zzxVv commented 4 weeks ago

设置为8G后正常了,之前4G会内存溢出

baisui1981 commented 4 weeks ago

莫非sqlserver 里某个字段 是大文本?

zzxVv commented 4 weeks ago

莫非sqlserver 里某个字段 是大文本?

没有大文本,最大的字段为varchar(240)

zzxVv commented 4 weeks ago

现在在K8S创建powerjob集群时,拉取镜像出现 Warning Failed 3m51s kubelet Failed to pull image "registry.cn-hangzhou.aliyuncs.com/tis/powerjob-server:4.3.6": failed to pull and unpack image "registry.cn-hangzhou.aliyuncs.com/tis/powerjob-server:4.3.6": failed to resolve reference "registry.cn-hangzhou.aliyuncs.com/tis/powerjob-server:4.3.6": failed to do request: Head "https://registry.cn-hangzhou.aliyuncs.com/v2/tis/powerjob-server/manifests/4.3.6": dial tcp: lookup registry.cn-hangzhou.aliyuncs.com on [::1]:53: read udp [::1]:54150->[::1]:53: read: connection refused Warning Failed 3m40s (x6 over 5m18s) kubelet Error: ImagePullBackOff Normal BackOff 11s (x22 over 5m18s) kubelet Back-off pulling image "registry.cn-hangzhou.aliyuncs.com/tis/powerjob-server:4.3.6"

使用其他镜像源部署powerjob时,出现 2024-10-24 16:54:56.932 ERROR 7 --- [ main] c.z.h.p.HikariPool : HikariPool-1 - Exception during pool initialization. com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

mysql数据库部署采用的是Customized模式。