apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0
8.02k stars 1.82k forks source link

[Bug] [Connector-V2][FLINK-Hive-Sink] Caused by: java.io.FileNotFoundException while mysql data write to hive #3203

Closed dik111 closed 2 years ago

dik111 commented 2 years ago

Search before asking

What happened

I use seatunnel to test mysql data writing to hive it threw an exception, causing data not to be written to hive

SeaTunnel Version

seatunnel-version: 2.2.0-beta flink-version:1.13.3 hive-version:3.0.0 mysql-version:5.7

SeaTunnel Config

env { 
  job.mode = "BATCH"
}
source {
    jdbc {
        driver = "com.mysql.jdbc.Driver"
        url = "jdbc:mysql://xx:3307/xx?zeroDateTimeBehavior=convertToNull&useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&useSSL=false&autoReconnect=true"
        query = "select id   from   sogal_cti_db.SG_CTI_CALL_RECORD limit 10   "
        user = "xx"
        password = "xx"
            }
}
sink {
    Hive {
        table_name = "test.in_csi_sogal_cti_db_sg_cti_call_record2"
        save_mode = "overwrite"
        metastore_uri = "thrift://xx:9083"
        sink_columns = ["ID"]
    }
}
transform{}

Running Command

bin/start-seatunnel-flink-connector-v2.sh -m yarn-cluster -ynm seatunnel --config ./config/mysql-hive-flink.conf

Error Exception

2022-10-27 14:00:10
java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/4f90db07e11241df8c9aca8ecde61199 does not exist.
    at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.<init>(BaseFileSinkWriter.java:47)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75)
    at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55)
    at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136)
    at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: java.lang.NullPointerException
        at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662)
        ... 4 more
Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/4f90db07e11241df8c9aca8ecde61199 does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242)
    ... 14 more

Flink or Spark Version

flink-version:1.13.3

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

Code of Conduct

dik111 commented 2 years ago

HI @EricJoy2048 @TyrantLucifer can you help me solve this problem?

TyrantLucifer commented 2 years ago

HI @EricJoy2048 @TyrantLucifer can you help me solve this problem?

Could you please offer more detailed log?

dik111 commented 2 years ago

HI @EricJoy2048 @TyrantLucifer can you help me solve this problem?

Could you please offer more detailed log? Here is Taskmanager log :


2022-10-28 14:15:20,902 INFO  org.apache.flink.metrics.influxdb.InfluxdbReporter           [] - Configured InfluxDBReporter with {host:10.10.10.139, port:8086, db:flink_metrics, retentionPolicy: and consistency:ANY}
2022-10-28 14:15:20,904 INFO  org.apache.flink.runtime.metrics.MetricRegistryImpl          [] - Periodically reporting metrics in intervals of 2 min for reporter influxdb of type org.apache.flink.metrics.influxdb.InfluxdbReporter.
2022-10-28 14:15:20,910 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils        [] - Trying to start actor system, external address jtbihdp05.sogal.com:0, bind address 0.0.0.0:0.
2022-10-28 14:15:20,924 INFO  akka.event.slf4j.Slf4jLogger                                 [] - Slf4jLogger started
2022-10-28 14:15:20,927 INFO  akka.remote.Remoting                                         [] - Starting remoting
2022-10-28 14:15:20,935 INFO  akka.remote.Remoting                                         [] - Remoting started; listening on addresses :[akka.tcp://flink-metrics@jtbihdp05.sogal.com:45954]
2022-10-28 14:15:20,970 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils        [] - Actor system started at akka.tcp://flink-metrics@jtbihdp05.sogal.com:45954
2022-10-28 14:15:20,988 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService             [] - Starting RPC endpoint for org.apache.flink.runtime.metrics.dump.MetricQueryService at akka://flink-metrics/user/rpc/MetricQueryService_container_e46_1666078799567_20758_01_000002 .
2022-10-28 14:15:21,005 INFO  org.apache.flink.runtime.blob.PermanentBlobCache             [] - Created BLOB cache storage directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1666078799567_20758/blobStore-46584d80-1668-49bd-979f-00871c671209
2022-10-28 14:15:21,008 INFO  org.apache.flink.runtime.blob.TransientBlobCache             [] - Created BLOB cache storage directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1666078799567_20758/blobStore-28181ea2-f787-46f9-839d-d796bf209c1a
2022-10-28 14:15:21,010 INFO  org.apache.flink.runtime.externalresource.ExternalResourceUtils [] - Enabled external resources: []
2022-10-28 14:15:21,011 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] - Starting TaskManager with ResourceID: container_e46_1666078799567_20758_01_000002(jtbihdp05.sogal.com:45454)
2022-10-28 14:15:21,047 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerServices    [] - Temporary file directory '/data/hadoop/yarn/local/usercache/hdfs/appcache/application_1666078799567_20758': total 688 GB, usable 285 GB (41.42% usable)
2022-10-28 14:15:21,051 INFO  org.apache.flink.runtime.io.disk.FileChannelManagerImpl      [] - FileChannelManager uses directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1666078799567_20758/flink-io-de15fe44-67ae-407c-956a-ef7e316fbccd for spill files.
2022-10-28 14:15:21,062 INFO  org.apache.flink.runtime.io.network.netty.NettyConfig        [] - NettyConfig [server address: /0.0.0.0, server port: 0, ssl enabled: false, memory segment size (bytes): 32768, transport type: AUTO, number of server threads: 1 (manual), number of client threads: 1 (manual), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
2022-10-28 14:15:21,065 INFO  org.apache.flink.runtime.io.disk.FileChannelManagerImpl      [] - FileChannelManager uses directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1666078799567_20758/flink-netty-shuffle-8abd7d70-7506-4c87-8e30-9676b1d3dd29 for spill files.
2022-10-28 14:15:21,719 INFO  org.apache.flink.runtime.io.network.buffer.NetworkBufferPool [] - Allocated 1024 MB for network buffer pool (number of memory segments: 32768, bytes per segment: 32768).
2022-10-28 14:15:21,735 INFO  org.apache.flink.runtime.io.network.NettyShuffleEnvironment  [] - Starting the network environment and its components.
2022-10-28 14:15:21,815 INFO  org.apache.flink.runtime.io.network.netty.NettyClient        [] - Transport type 'auto': using EPOLL.
2022-10-28 14:15:21,817 INFO  org.apache.flink.runtime.io.network.netty.NettyClient        [] - Successful initialization (took 81 ms).
2022-10-28 14:15:21,823 INFO  org.apache.flink.runtime.io.network.netty.NettyServer        [] - Transport type 'auto': using EPOLL.
2022-10-28 14:15:21,865 INFO  org.apache.flink.runtime.io.network.netty.NettyServer        [] - Successful initialization (took 47 ms). Listening on SocketAddress /0.0.0.0:45323.
2022-10-28 14:15:21,867 INFO  org.apache.flink.runtime.taskexecutor.KvStateService         [] - Starting the kvState service and its components.
2022-10-28 14:15:21,899 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService             [] - Starting RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at akka://flink/user/rpc/taskmanager_0 .
2022-10-28 14:15:21,920 INFO  org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job leader service.
2022-10-28 14:15:21,921 INFO  org.apache.flink.runtime.filecache.FileCache                 [] - User file cache uses directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1666078799567_20758/flink-dist-cache-215a5c53-e55a-4487-809d-b78d97ed0d65
2022-10-28 14:15:21,923 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Connecting to ResourceManager akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/resourcemanager_*(00000000000000000000000000000000).
2022-10-28 14:15:22,140 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Resolved ResourceManager address, beginning registration
2022-10-28 14:15:22,266 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Successful registration at resource manager akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/resourcemanager_* under registration id 0fc620f840b4fe95a3520e4d69e5e83f.
2022-10-28 14:15:22,296 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Receive slot request c4010f729fccb4f10e340a1af04f61af for job fd6fe20dd0c5f6d712cc4805da7f0a8f from resource manager with leader id 00000000000000000000000000000000.
2022-10-28 14:15:22,303 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Allocated slot for c4010f729fccb4f10e340a1af04f61af.
2022-10-28 14:15:22,305 INFO  org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Add job fd6fe20dd0c5f6d712cc4805da7f0a8f for job leader monitoring.
2022-10-28 14:15:22,306 INFO  org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Try to register at job manager akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/jobmanager_2 with leader id 00000000-0000-0000-0000-000000000000.
2022-10-28 14:15:22,329 INFO  org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Resolved JobManager address, beginning registration
2022-10-28 14:15:22,350 INFO  org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Successful registration at job manager akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/jobmanager_2 for job fd6fe20dd0c5f6d712cc4805da7f0a8f.
2022-10-28 14:15:22,351 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Establish JobManager connection for job fd6fe20dd0c5f6d712cc4805da7f0a8f.
2022-10-28 14:15:22,354 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Offer reserved slots to the leader of job fd6fe20dd0c5f6d712cc4805da7f0a8f.
2022-10-28 14:15:22,375 INFO  org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af.
2022-10-28 14:15:22,378 INFO  org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af.
2022-10-28 14:15:22,412 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af.
2022-10-28 14:15:22,412 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35) switched from CREATED to DEPLOYING.
2022-10-28 14:15:22,416 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35) [DEPLOYING].
2022-10-28 14:15:22,453 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask          [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@5d339fa8
2022-10-28 14:15:22,461 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask          [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel")
2022-10-28 14:15:23,131 INFO  org.apache.hadoop.conf.Configuration.deprecation             [] - No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS
2022-10-28 14:15:23,306 INFO  org.apache.hadoop.conf.Configuration.deprecation             [] - No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS
2022-10-28 14:15:24,513 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory      [] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2022-10-28 14:15:24,531 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35) switched from DEPLOYING to INITIALIZING.
2022-10-28 14:15:24,992 INFO  org.apache.seatunnel.translation.flink.source.BaseSeaTunnelSourceFunction [] - Consumer subtask 0 has no restore state.
2022-10-28 14:15:25,163 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35) switched from INITIALIZING to RUNNING.
2022-10-28 14:15:25,211 INFO  org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.MemoryManagerImpl [] - orc.rows.between.memory.checks=5000
2022-10-28 14:15:25,250 INFO  org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.PhysicalFsWriter [] - ORC writer created for path: /tmp/seatunnel/seatunnel/2856d9e234794d86a72d46a1de0b71a9/T_2856d9e234794d86a72d46a1de0b71a9_0_1/NON_PARTITION/T_2856d9e234794d86a72d46a1de0b71a9_0_1.orc with stripeSize: 67108864 blockSize: 268435456 compression: SNAPPY bufferSize: 262144
2022-10-28 14:15:25,526 INFO  org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.OrcCodecPool [] - Got brand-new codec SNAPPY
2022-10-28 14:15:25,560 INFO  org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.WriterImpl [] - ORC writer created for path: /tmp/seatunnel/seatunnel/2856d9e234794d86a72d46a1de0b71a9/T_2856d9e234794d86a72d46a1de0b71a9_0_1/NON_PARTITION/T_2856d9e234794d86a72d46a1de0b71a9_0_1.orc with stripeSize: 67108864 blockSize: 268435456 compression: SNAPPY bufferSize: 262144
2022-10-28 14:15:25,564 WARN  org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.MemoryManagerImpl [] - Owner thread expected Thread[Legacy Source Thread - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0,5,Flink Task Threads], got Thread[Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0,5,Flink Task Threads]
2022-10-28 14:15:25,797 INFO  org.apache.seatunnel.connectors.seatunnel.jdbc.source.JdbcSource [] - Closed the bounded jdbc source
2022-10-28 14:15:25,989 INFO  org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Committing the state for checkpoint 1
2022-10-28 14:15:25,990 INFO  org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils [] - begin rename file oldName :[/tmp/seatunnel/seatunnel/2856d9e234794d86a72d46a1de0b71a9/T_2856d9e234794d86a72d46a1de0b71a9_0_1/NON_PARTITION/T_2856d9e234794d86a72d46a1de0b71a9_0_1.orc] to newName :[/warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/T_2856d9e234794d86a72d46a1de0b71a9_0_1.orc]
2022-10-28 14:15:26,000 INFO  org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils [] - rename file :[/tmp/seatunnel/seatunnel/2856d9e234794d86a72d46a1de0b71a9/T_2856d9e234794d86a72d46a1de0b71a9_0_1/NON_PARTITION/T_2856d9e234794d86a72d46a1de0b71a9_0_1.orc] to [/warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/T_2856d9e234794d86a72d46a1de0b71a9_0_1.orc] finish
2022-10-28 14:15:55,400 WARN  org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.MemoryManagerImpl [] - Owner thread expected Thread[Legacy Source Thread - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0,5,Flink Task Threads], got Thread[Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0,5,Flink Task Threads]
2022-10-28 14:15:55,414 WARN  org.apache.flink.runtime.taskmanager.Task                    [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35) switched from RUNNING to FAILED with failure cause: java.lang.Exception: Could not perform checkpoint 2 for operator Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0.
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointAsyncInMailbox(StreamTask.java:1006)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$triggerCheckpointAsync$7(StreamTask.java:958)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.PhysicalFsWriter.padStripe(PhysicalFsWriter.java:154)
at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.PhysicalFsWriter.finalizeStripe(PhysicalFsWriter.java:369)
at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:466)
at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.WriterImpl.close(WriterImpl.java:580)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.lambda$finishAndCloseFile$0(OrcWriteStrategy.java:78)
at java.util.HashMap.forEach(HashMap.java:1289)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.finishAndCloseFile(OrcWriteStrategy.java:76)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.prepareCommit(AbstractWriteStrategy.java:194)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.prepareCommit(BaseFileSinkWriter.java:68)
at org.apache.seatunnel.translation.flink.sink.FlinkSinkWriter.prepareCommit(FlinkSinkWriter.java:59)
at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.prepareSnapshotPreBarrier(AbstractSinkWriterOperator.java:86)
at org.apache.flink.streaming.runtime.tasks.OperatorChain.prepareSnapshotPreBarrier(OperatorChain.java:415)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:292)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$8(StreamTask.java:1092)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1076)
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointAsyncInMailbox(StreamTask.java:994)
... 13 more

2022-10-28 14:15:55,415 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 (68d3dfec412ef9a1fc5877e30edc1d35). 2022-10-28 14:15:55,421 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0 68d3dfec412ef9a1fc5877e30edc1d35. 2022-10-28 14:15:56,494 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:56,497 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 (cb535d1246d250212d34b6a272878d25), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:56,498 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 (cb535d1246d250212d34b6a272878d25) switched from CREATED to DEPLOYING. 2022-10-28 14:15:56,499 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 (cb535d1246d250212d34b6a272878d25) [DEPLOYING]. 2022-10-28 14:15:56,500 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@3042f758 2022-10-28 14:15:56,501 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:15:56,501 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 (cb535d1246d250212d34b6a272878d25) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:15:56,569 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 (cb535d1246d250212d34b6a272878d25) switched from INITIALIZING to FAILED with failure cause: java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ... 4 more Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ... 14 more

2022-10-28 14:15:56,569 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 (cb535d1246d250212d34b6a272878d25). 2022-10-28 14:15:56,572 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#1 cb535d1246d250212d34b6a272878d25. 2022-10-28 14:15:57,620 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:57,636 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 (fcd6d968d1d089de7b9a0db9d520bcc3), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:57,637 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 (fcd6d968d1d089de7b9a0db9d520bcc3) switched from CREATED to DEPLOYING. 2022-10-28 14:15:57,637 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 (fcd6d968d1d089de7b9a0db9d520bcc3) [DEPLOYING]. 2022-10-28 14:15:57,638 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@61c82680 2022-10-28 14:15:57,638 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:15:57,639 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 (fcd6d968d1d089de7b9a0db9d520bcc3) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:15:57,669 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 (fcd6d968d1d089de7b9a0db9d520bcc3) switched from INITIALIZING to FAILED with failure cause: java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ... 4 more Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ... 14 more

2022-10-28 14:15:57,669 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 (fcd6d968d1d089de7b9a0db9d520bcc3). 2022-10-28 14:15:57,671 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#2 fcd6d968d1d089de7b9a0db9d520bcc3. 2022-10-28 14:15:58,717 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:58,720 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 (59b21aeab6a261288e98cec23899fa5e), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:58,721 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 (59b21aeab6a261288e98cec23899fa5e) switched from CREATED to DEPLOYING. 2022-10-28 14:15:58,721 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 (59b21aeab6a261288e98cec23899fa5e) [DEPLOYING]. 2022-10-28 14:15:58,722 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@3e08a199 2022-10-28 14:15:58,722 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:15:58,723 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 (59b21aeab6a261288e98cec23899fa5e) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:15:58,746 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 (59b21aeab6a261288e98cec23899fa5e) switched from INITIALIZING to FAILED with failure cause: java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ... 4 more Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ... 14 more

2022-10-28 14:15:58,746 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 (59b21aeab6a261288e98cec23899fa5e). 2022-10-28 14:15:58,748 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#3 59b21aeab6a261288e98cec23899fa5e. 2022-10-28 14:15:59,776 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:59,778 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 (978ff194843abd7f8ff8a478718dce1d), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:15:59,778 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 (978ff194843abd7f8ff8a478718dce1d) switched from CREATED to DEPLOYING. 2022-10-28 14:15:59,779 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 (978ff194843abd7f8ff8a478718dce1d) [DEPLOYING]. 2022-10-28 14:15:59,780 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@1090b7d7 2022-10-28 14:15:59,780 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:15:59,781 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 (978ff194843abd7f8ff8a478718dce1d) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:15:59,800 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 (978ff194843abd7f8ff8a478718dce1d) switched from INITIALIZING to FAILED with failure cause: java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ... 4 more Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ... 14 more

2022-10-28 14:15:59,801 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 (978ff194843abd7f8ff8a478718dce1d). 2022-10-28 14:15:59,802 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#4 978ff194843abd7f8ff8a478718dce1d. 2022-10-28 14:16:00,851 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:16:00,853 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 (cbf6e47796c0ab85745e1b6c6d25327a), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:16:00,854 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 (cbf6e47796c0ab85745e1b6c6d25327a) switched from CREATED to DEPLOYING. 2022-10-28 14:16:00,854 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 (cbf6e47796c0ab85745e1b6c6d25327a) [DEPLOYING]. 2022-10-28 14:16:00,855 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@267adad6 2022-10-28 14:16:00,855 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:16:00,856 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 (cbf6e47796c0ab85745e1b6c6d25327a) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:16:00,883 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 (cbf6e47796c0ab85745e1b6c6d25327a) switched from INITIALIZING to FAILED with failure cause: java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ... 4 more Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ... 14 more

2022-10-28 14:16:00,884 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 (cbf6e47796c0ab85745e1b6c6d25327a). 2022-10-28 14:16:00,886 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#5 cbf6e47796c0ab85745e1b6c6d25327a. 2022-10-28 14:16:01,926 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:16:01,928 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 (64a9f983e455b6d8ff4a8ca75075077c), deploy into slot with allocation id c4010f729fccb4f10e340a1af04f61af. 2022-10-28 14:16:01,928 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 (64a9f983e455b6d8ff4a8ca75075077c) switched from CREATED to DEPLOYING. 2022-10-28 14:16:01,929 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 (64a9f983e455b6d8ff4a8ca75075077c) [DEPLOYING]. 2022-10-28 14:16:01,930 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@41fc1e7c 2022-10-28 14:16:01,930 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:16:01,930 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 (64a9f983e455b6d8ff4a8ca75075077c) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:16:01,954 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 (64a9f983e455b6d8ff4a8ca75075077c) switched from INITIALIZING to FAILED with failure cause: java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ... 4 more Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ... 14 more

2022-10-28 14:16:01,955 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 (64a9f983e455b6d8ff4a8ca75075077c). 2022-10-28 14:16:01,956 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#6 64a9f983e455b6d8ff4a8ca75075077c.


And here is job manager log:

2022-10-28 14:14:59,584 INFO org.apache.flink.metrics.influxdb.InfluxdbReporter [] - Configured InfluxDBReporter with {host:10.10.10.139, port:8086, db:flink_metrics, retentionPolicy: and consistency:ANY} 2022-10-28 14:14:59,586 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl [] - Periodically reporting metrics in intervals of 2 min for reporter influxdb of type org.apache.flink.metrics.influxdb.InfluxdbReporter. 2022-10-28 14:14:59,591 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Trying to start actor system, external address jtbihdp01.sogal.com:0, bind address 0.0.0.0:0. 2022-10-28 14:14:59,601 INFO akka.event.slf4j.Slf4jLogger [] - Slf4jLogger started 2022-10-28 14:14:59,624 INFO akka.remote.Remoting [] - Starting remoting 2022-10-28 14:14:59,629 INFO akka.remote.Remoting [] - Remoting started; listening on addresses :[akka.tcp://flink-metrics@jtbihdp01.sogal.com:36653] 2022-10-28 14:14:59,652 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Actor system started at akka.tcp://flink-metrics@jtbihdp01.sogal.com:36653 2022-10-28 14:14:59,662 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.metrics.dump.MetricQueryService at akka://flink-metrics/user/rpc/MetricQueryService . 2022-10-28 14:14:59,700 WARN org.apache.flink.configuration.Configuration [] - Config uses deprecated configuration key 'web.port' instead of proper key 'rest.bind-port' 2022-10-28 14:14:59,701 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Upload directory /tmp/flink-web-fe0a7160-728a-4f2f-bcc1-1184e5f892a8/flink-web-upload does not exist. 2022-10-28 14:14:59,702 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Created directory /tmp/flink-web-fe0a7160-728a-4f2f-bcc1-1184e5f892a8/flink-web-upload for file uploads. 2022-10-28 14:14:59,714 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Starting rest endpoint. 2022-10-28 14:14:59,997 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils [] - Determined location of main cluster component log file: /data/hadoop/yarn/log/application_1666078799567_20758/container_e46_1666078799567_20758_01_000001/jobmanager.log 2022-10-28 14:14:59,998 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils [] - Determined location of main cluster component stdout file: /data/hadoop/yarn/log/application_1666078799567_20758/container_e46_1666078799567_20758_01_000001/jobmanager.out 2022-10-28 14:15:00,123 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Rest endpoint listening at jtbihdp01.sogal.com:34323 2022-10-28 14:15:00,123 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - http://jtbihdp01.sogal.com:34323 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000 2022-10-28 14:15:00,124 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Web frontend listening at http://jtbihdp01.sogal.com:34323. 2022-10-28 14:15:00,135 INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (3.000gb (3221225520 bytes)) is greater than its max value 1024.000mb (1073741824 bytes), max value will be used instead 2022-10-28 14:15:00,136 INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction network memory (2.875gb (3087007790 bytes)) is greater than its max value 1024.000mb (1073741824 bytes), max value will be used instead 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: internal.jobgraph-path, job.graph 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.interval, 120 SECONDS 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: classloader.check-leaked-classloader, false 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.host, 10.10.10.139 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.execution.failover-strategy, region 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: high-availability.cluster-id, application_1666078799567_20758 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.address, localhost 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: state.savepoints.dir, hdfs://hacluster/flink/checkpoint/cdc-test 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.runtime-mode, STREAMING 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.db, flink_metrics 2022-10-28 14:15:00,184 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.connectTimeout, 60000 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.savepoint.ignore-unclaimed-state, false 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: parallelism.default, 1 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.numberOfTaskSlots, 1 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: pipeline.classpaths, file:/data/software/seatunnel/apache-seatunnel-incubating-2.2.0-SNAPSHOT/connectors/seatunnel/connector-jdbc-2.2.0-SNAPSHOT.jar;file:/data/software/seatunnel/apache-seatunnel-incubating-2.2.0-SNAPSHOT/connectors/seatunnel/connector-hive-2.2.0-SNAPSHOT-2.11.12.jar 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: yarn.application.name, seatunnel 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.memory.process.size, 30g 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: classloader.resolve-order, parent-first 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.scheme, http 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.target, yarn-per-job 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.memory.process.size, 1024m 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.port, 6123 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobstore.expiration-time, 432000 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.checkpointing.interval, 30000 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.port, 8086 2022-10-28 14:15:00,185 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.attached, true 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: internal.cluster.execution-mode, NORMAL 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.writeTimeout, 60000 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.shutdown-on-attached-exit, false 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: pipeline.jars, file:/data/software/seatunnel/apache-seatunnel-incubating-2.2.0-SNAPSHOT/lib/seatunnel-flink-starter.jar;file:/data/software/seatunnel/apache-seatunnel-incubating-2.2.0-SNAPSHOT/connectors/seatunnel/connector-jdbc-2.2.0-SNAPSHOT.jar;file:/data/software/seatunnel/apache-seatunnel-incubating-2.2.0-SNAPSHOT/connectors/seatunnel/connector-hive-2.2.0-SNAPSHOT-2.11.12.jar 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.memory.managed.fraction, 0.05 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.consistency, ANY 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: metrics.reporter.influxdb.factory.class, org.apache.flink.metrics.influxdb.InfluxdbReporterFactory 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: $internal.deployment.config-dir, /data/software/flink/test/flink-1.13.3/conf 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: $internal.yarn.log-config-file, /data/software/flink/test/flink-1.13.3/conf/log4j.properties 2022-10-28 14:15:00,186 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: state.checkpoints.dir, hdfs://hacluster/flink/checkpoint/seatunnel 2022-10-28 14:15:00,273 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager at akka://flink/user/rpc/resourcemanager_0 . 2022-10-28 14:15:00,323 INFO org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunner [] - DefaultDispatcherRunner was granted leadership with leader id 00000000-0000-0000-0000-000000000000. Creating new DispatcherLeaderProcess. 2022-10-28 14:15:00,327 INFO org.apache.flink.runtime.dispatcher.runner.JobDispatcherLeaderProcess [] - Start JobDispatcherLeaderProcess. 2022-10-28 14:15:00,332 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.MiniDispatcher at akka://flink/user/rpc/dispatcher_1 . 2022-10-28 14:15:00,347 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Starting the resource manager. 2022-10-28 14:15:00,516 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/rpc/jobmanager2 . 2022-10-28 14:15:00,523 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Initializing job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f). 2022-10-28 14:15:00,604 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Using restart back off time strategy FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=2147483647, backoffTimeMS=1000) for seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f). 2022-10-28 14:15:00,787 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Running initialization on master for job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f). 2022-10-28 14:15:00,787 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Successfully ran initialization on master in 0 ms. 2022-10-28 14:15:00,881 INFO org.apache.flink.runtime.scheduler.adapter.DefaultExecutionTopology [] - Built 1 pipelined regions in 28 ms 2022-10-28 14:15:00,898 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@26bd2480 2022-10-28 14:15:00,902 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Checkpoint storage is set to 'filesystem': (checkpoints "hdfs://hacluster/flink/checkpoint/seatunnel") 2022-10-28 14:15:01,639 INFO org.apache.hadoop.conf.Configuration.deprecation [] - No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS 2022-10-28 14:15:02,174 INFO org.apache.hadoop.conf.Configuration.deprecation [] - No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS 2022-10-28 14:15:03,173 WARN org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory [] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded. 2022-10-28 14:15:03,217 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - No checkpoint found during restore. 2022-10-28 14:15:03,243 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Using failover strategy org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy@2bc9ad4b for seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f). 2022-10-28 14:15:03,245 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Recovered 0 containers from previous attempts ([]). 2022-10-28 14:15:03,245 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Recovered 0 workers from previous attempt. 2022-10-28 14:15:03,263 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Starting execution of job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f) under job master id 00000000000000000000000000000000. 2022-10-28 14:15:03,264 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Starting scheduling with scheduling strategy [org.apache.flink.runtime.scheduler.strategy.PipelinedRegionSchedulingStrategy] 2022-10-28 14:15:03,265 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f) switched from state CREATED to RUNNING. 2022-10-28 14:15:03,272 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (68d3dfec412ef9a1fc5877e30edc1d35) switched from CREATED to SCHEDULED. 2022-10-28 14:15:03,294 INFO org.apache.hadoop.conf.Configuration [] - found resource resource-types.xml at file:/etc/hadoop/3.1.0.0-78/0/resource-types.xml 2022-10-28 14:15:03,368 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Connecting to ResourceManager akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/resourcemanager*(00000000000000000000000000000000) 2022-10-28 14:15:03,398 INFO org.apache.flink.runtime.externalresource.ExternalResourceUtils [] - Enabled external resources: [] 2022-10-28 14:15:03,403 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl [] - Upper bound of the thread pool size is 500 2022-10-28 14:15:03,407 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - ResourceManager akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/resourcemanager_0 was granted leadership with fencing token 00000000000000000000000000000000 2022-10-28 14:15:03,417 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Resolved ResourceManager address, beginning registration 2022-10-28 14:15:03,420 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Registering job manager 00000000000000000000000000000000@akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/jobmanager_2 for job fd6fe20dd0c5f6d712cc4805da7f0a8f. 2022-10-28 14:15:03,423 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Registered job manager 00000000000000000000000000000000@akka.tcp://flink@jtbihdp01.sogal.com:35554/user/rpc/jobmanager_2 for job fd6fe20dd0c5f6d712cc4805da7f0a8f. 2022-10-28 14:15:03,426 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000. 2022-10-28 14:15:03,428 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Received resource requirements from job fd6fe20dd0c5f6d712cc4805da7f0a8f: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}] 2022-10-28 14:15:03,444 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Requesting new worker with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=26.062gb (27984396265 bytes), taskOffHeapSize=0 bytes, networkMemSize=1024.000mb (1073741824 bytes), managedMemSize=1.438gb (1543503895 bytes), numSlots=1}, current pending count: 1. 2022-10-28 14:15:03,504 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Requesting new TaskExecutor container with resource TaskExecutorProcessSpec {cpuCores=1.0, frameworkHeapSize=128.000mb (134217728 bytes), frameworkOffHeapSize=128.000mb (134217728 bytes), taskHeapSize=26.062gb (27984396265 bytes), taskOffHeapSize=0 bytes, networkMemSize=1024.000mb (1073741824 bytes), managedMemorySize=1.438gb (1543503895 bytes), jvmMetaspaceSize=256.000mb (268435456 bytes), jvmOverheadSize=1024.000mb (1073741824 bytes), numSlots=1}, priority 1. 2022-10-28 14:15:04,820 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Received 1 containers. 2022-10-28 14:15:04,821 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Received 1 containers with priority 1, 1 pending container requests. 2022-10-28 14:15:04,826 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Removing container request Capability[<memory:30720, vCores:1>]Priority[1]AllocationRequestId[0]ExecutionTypeRequest[{Execution Type: GUARANTEED, Enforce Execution Type: false}]Resource Profile[null]. 2022-10-28 14:15:04,827 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Accepted 1 requested containers, returned 0 excess containers, 0 pending container requests of resource <memory:30720, vCores:1>. 2022-10-28 14:15:04,827 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - TaskExecutor container_e46_1666078799567_20758_01_000002(jtbihdp05.sogal.com:45454) will be started on jtbihdp05.sogal.com with TaskExecutorProcessSpec {cpuCores=1.0, frameworkHeapSize=128.000mb (134217728 bytes), frameworkOffHeapSize=128.000mb (134217728 bytes), taskHeapSize=26.062gb (27984396265 bytes), taskOffHeapSize=0 bytes, networkMemSize=1024.000mb (1073741824 bytes), managedMemorySize=1.438gb (1543503895 bytes), jvmMetaspaceSize=256.000mb (268435456 bytes), jvmOverheadSize=1024.000mb (1073741824 bytes), numSlots=1}. 2022-10-28 14:15:04,894 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Creating container launch context for TaskManagers 2022-10-28 14:15:04,896 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Starting TaskManagers 2022-10-28 14:15:04,919 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Requested worker container_e46_1666078799567_20758_01_000002(jtbihdp05.sogal.com:45454) with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=26.062gb (27984396265 bytes), taskOffHeapSize=0 bytes, networkMemSize=1024.000mb (1073741824 bytes), managedMemSize=1.438gb (1543503895 bytes), numSlots=1}. 2022-10-28 14:15:04,919 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl [] - Processing Event EventType: START_CONTAINER for Container container_e46_1666078799567_20758_01_000002 2022-10-28 14:15:21,856 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Registering TaskManager with ResourceID container_e46_1666078799567_20758_01_000002(jtbihdp05.sogal.com:45454) (akka.tcp://flink@jtbihdp05.sogal.com:45497/user/rpc/taskmanager_0) at ResourceManager 2022-10-28 14:15:21,914 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Worker container_e46_1666078799567_20758_01_000002(jtbihdp05.sogal.com:45454) is registered. 2022-10-28 14:15:21,914 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Worker container_e46_1666078799567_20758_01_000002(jtbihdp05.sogal.com:45454) with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=26.062gb (27984396265 bytes), taskOffHeapSize=0 bytes, networkMemSize=1024.000mb (1073741824 bytes), managedMemSize=1.438gb (1543503895 bytes), numSlots=1} was requested in current attempt. Current pending count after registering: 0. 2022-10-28 14:15:21,987 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (68d3dfec412ef9a1fc5877e30edc1d35) switched from SCHEDULED to DEPLOYING. 2022-10-28 14:15:21,988 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (attempt #0) with attempt id 68d3dfec412ef9a1fc5877e30edc1d35 to container_e46_1666078799567_20758_01_000002 @ jtbihdp05.sogal.com (dataPort=45323) with allocation id c4010f729fccb4f10e340a1af04f61af 2022-10-28 14:15:24,159 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (68d3dfec412ef9a1fc5877e30edc1d35) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:15:24,788 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (68d3dfec412ef9a1fc5877e30edc1d35) switched from INITIALIZING to RUNNING. 2022-10-28 14:15:25,071 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 1 (type=CHECKPOINT) @ 1666937725016 for job fd6fe20dd0c5f6d712cc4805da7f0a8f. 2022-10-28 14:15:25,604 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 1 for job fd6fe20dd0c5f6d712cc4805da7f0a8f (2232 bytes in 584 ms). 2022-10-28 14:15:55,018 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 2 (type=CHECKPOINT) @ 1666937755014 for job fd6fe20dd0c5f6d712cc4805da7f0a8f. 2022-10-28 14:15:55,054 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (68d3dfec412ef9a1fc5877e30edc1d35) switched from RUNNING to FAILED on container_e46_1666078799567_20758_01_000002 @ jtbihdp05.sogal.com (dataPort=45323). java.lang.Exception: Could not perform checkpoint 2 for operator Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1)#0. at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointAsyncInMailbox(StreamTask.java:1006) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$triggerCheckpointAsync$7(StreamTask.java:958) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_281] Caused by: java.lang.NullPointerException at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.PhysicalFsWriter.padStripe(PhysicalFsWriter.java:154) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.PhysicalFsWriter.finalizeStripe(PhysicalFsWriter.java:369) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:466) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.shade.connector.hive.org.apache.orc.impl.WriterImpl.close(WriterImpl.java:580) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.lambda$finishAndCloseFile$0(OrcWriteStrategy.java:78) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at java.util.HashMap.forEach(HashMap.java:1289) ~[?:1.8.0_281] at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.finishAndCloseFile(OrcWriteStrategy.java:76) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.prepareCommit(AbstractWriteStrategy.java:194) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.prepareCommit(BaseFileSinkWriter.java:68) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.translation.flink.sink.FlinkSinkWriter.prepareCommit(FlinkSinkWriter.java:59) ~[seatunnel-flink-starter.jar:2.2.0-SNAPSHOT] at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.prepareSnapshotPreBarrier(AbstractSinkWriterOperator.java:86) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.OperatorChain.prepareSnapshotPreBarrier(OperatorChain.java:415) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:292) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$8(StreamTask.java:1092) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1076) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointAsyncInMailbox(StreamTask.java:994) ~[flink-dist_2.11-1.13.3.jar:1.13.3] ... 13 more 2022-10-28 14:15:55,077 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Clearing resource requirements of job fd6fe20dd0c5f6d712cc4805da7f0a8f 2022-10-28 14:15:55,079 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - Calculating tasks to restart to recover the failed task cbc357ccb763df2852fee8c4fc7d55f2_0. 2022-10-28 14:15:55,080 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - 1 tasks should be restarted to recover the failed task cbc357ccb763df2852fee8c4fc7d55f2_0. 2022-10-28 14:15:55,081 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f) switched from state RUNNING to RESTARTING. 2022-10-28 14:15:56,091 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f) switched from state RESTARTING to RUNNING. 2022-10-28 14:15:56,093 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job fd6fe20dd0c5f6d712cc4805da7f0a8f from Checkpoint 1 @ 1666937725016 for fd6fe20dd0c5f6d712cc4805da7f0a8f located at hdfs://hacluster/flink/checkpoint/seatunnel/fd6fe20dd0c5f6d712cc4805da7f0a8f/chk-1. 2022-10-28 14:15:56,104 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - No master state to restore 2022-10-28 14:15:56,104 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (cb535d1246d250212d34b6a272878d25) switched from CREATED to SCHEDULED. 2022-10-28 14:15:56,106 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (cb535d1246d250212d34b6a272878d25) switched from SCHEDULED to DEPLOYING. 2022-10-28 14:15:56,106 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (attempt #1) with attempt id cb535d1246d250212d34b6a272878d25 to container_e46_1666078799567_20758_01_000002 @ jtbihdp05.sogal.com (dataPort=45323) with allocation id c4010f729fccb4f10e340a1af04f61af 2022-10-28 14:15:56,106 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Received resource requirements from job fd6fe20dd0c5f6d712cc4805da7f0a8f: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}] 2022-10-28 14:15:56,151 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (cb535d1246d250212d34b6a272878d25) switched from DEPLOYING to INITIALIZING. 2022-10-28 14:15:56,229 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: SeaTunnel JdbcSource -> Map -> to: Row -> Sink Writer: Hive -> Sink Global Committer: Hive (1/1) (cb535d1246d250212d34b6a272878d25) switched from INITIALIZING to FAILED on container_e46_1666078799567_20758_01_000002 @ jtbihdp05.sogal.com (dataPort=45323). java.lang.RuntimeException: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:245) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.(BaseFileSinkWriter.java:47) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSink.restoreWriter(BaseFileSink.java:75) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.translation.flink.sink.FlinkSink.createWriter(FlinkSink.java:55) ~[seatunnel-flink-starter.jar:2.2.0-SNAPSHOT] at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterOperator.createWriter(StatefulSinkWriterOperator.java:136) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.open(AbstractSinkWriterOperator.java:74) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_281] Suppressed: java.lang.NullPointerException at org.apache.flink.streaming.runtime.operators.sink.AbstractSinkWriterOperator.dispose(AbstractSinkWriterOperator.java:103) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:864) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:843) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:756) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:662) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) ~[flink-dist_2.11-1.13.3.jar:1.13.3] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_281] Caused by: java.io.FileNotFoundException: File /warehouse/tablespace/managed/hive/test.db/in_csi_sogal_cti_db_sg_cti_call_record2/seatunnel/2856d9e234794d86a72d46a1de0b71a9 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?] at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.dirList(FileSystemUtils.java:124) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.AbstractWriteStrategy.getTransactionIdFromStates(AbstractWriteStrategy.java:242) ~[connector-hive-2.2.0-SNAPSHOT-2.11.12.jar:2.2.0-SNAPSHOT] ... 14 more 2022-10-28 14:15:56,233 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Clearing resource requirements of job fd6fe20dd0c5f6d712cc4805da7f0a8f 2022-10-28 14:15:56,234 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - Calculating tasks to restart to recover the failed task cbc357ccb763df2852fee8c4fc7d55f2_0. 2022-10-28 14:15:56,234 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - 1 tasks should be restarted to recover the failed task cbc357ccb763df2852fee8c4fc7d55f2_0. 2022-10-28 14:15:56,234 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job seatunnel (fd6fe20dd0c5f6d712cc4805da7f0a8f) switched from state RUNNING to RESTARTING.


Any other detail log I can offer to you ?
TyrantLucifer commented 2 years ago

It's enough. BTW, could you please offer some example data from mysql and the hive table information, I need in the local simulation repetition. Thank you.

dik111 commented 2 years ago

It's enough. BTW, could you please offer some example data from mysql and the hive table information, I need in the local simulation repetition. Thank you.

Here is MYSQL table:

CREATE TABLE `sg_cti_call_record` (
  `ID` varchar(40) NOT NULL,
  `CREATED_USER` varchar(40) DEFAULT NULL,
  `CREATED_TIME` datetime DEFAULT NULL,
  `LAST_UPDATE_USER` varchar(40) DEFAULT NULL,
  `LAST_UPDATE_TIME` datetime DEFAULT NULL,
  `VERSION_NUMBER` int(11) DEFAULT '0',
  `RECORD_STATUS` varchar(40) DEFAULT 'VALID',
  `CALL_ID` varchar(40) DEFAULT NULL COMMENT '',
  `MAIN_CALLID` varchar(60) DEFAULT NULL COMMENT '',
  `CALLED_NUMBER` varchar(15) DEFAULT NULL COMMENT '',
  `CALLING_NUMBER` varchar(15) DEFAULT NULL COMMENT '',
  `CNO` varchar(40) DEFAULT NULL COMMENT '',
  `GROUP_ID` varchar(40) DEFAULT NULL COMMENT '',
  `TOTAL_DURATION` int(11) DEFAULT NULL COMMENT '',
  `TALK_TIME_LONG` int(11) DEFAULT '0' COMMENT '',
  `CONTENT` varchar(255) DEFAULT NULL COMMENT '',
  `CALL_TYPE` varchar(20) DEFAULT NULL COMMENT '',
  `STATUS` varchar(40) DEFAULT NULL,
  `START_TIME` datetime DEFAULT NULL COMMENT '',
  `ANSWER_TIME` datetime DEFAULT NULL COMMENT '',
  `END_TIME` datetime DEFAULT NULL COMMENT '',
  `IVR_KEY` varchar(30) DEFAULT NULL COMMENT '',
  `RING_TIME_LONG` int(11) DEFAULT '0' COMMENT '',
  `SATISFACTION` varchar(255) CHARACTER SET utf8mb4 DEFAULT NULL COMMENT '',
  `WORD_PROCESSING_LENGTH` int(11) DEFAULT NULL,
  `HANG_UP_REASON` varchar(255) DEFAULT NULL,
  `RECORD_URL` varchar(255) DEFAULT NULL,
  `RECORD_FILE_TYPE` varchar(20) DEFAULT NULL COMMENT '',
  `LOCAL_RECORD_FILE` varchar(255) DEFAULT NULL COMMENT '',
  `QNO` varchar(10) DEFAULT NULL COMMENT '',
  `QUEUE_NAME` varchar(30) DEFAULT NULL COMMENT '',
  `END_REASON` varchar(10) DEFAULT NULL,
  `SUBMIT_TIME` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`),
  KEY `CALL_ID` (`CALL_ID`) USING BTREE,
  KEY `CREATED_TIME` (`CREATED_TIME`),
  KEY `START_TIME` (`START_TIME`),
  KEY `END_TIME` (`END_TIME`),
  KEY `ANSWER_TIME` (`ANSWER_TIME`),
  KEY `MAIN_CALLID` (`MAIN_CALLID`),
  KEY `QNO` (`QNO`),
  KEY `RECORD_URL` (`RECORD_URL`(18)),
  KEY `idx_sg_cti_call_record_y01` (`RECORD_FILE_TYPE`,`LOCAL_RECORD_FILE`),
  KEY `idx_CNO_START_TIME` (`CNO`,`START_TIME`) USING BTREE,
  KEY `idx_lastupdate` (`LAST_UPDATE_TIME`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;

MYSQL example data:

INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('000000db55d8432b90f318c8052dfe96');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('0000035ccd044d0f93d67cdd7da4ae95');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('000003d5e9914a9c9203430632bd7b8e');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('0000046480924703bb754f49a6fe461f');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('00000504d1e04b2197a6b4d707945f5e');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('0000059d7bb447d78810e3d5725da0c7');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('000005ada16f4eb4889d90d2b23f4991');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('0000068dba0b432bb1e525c6eba5ff02');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('00000894fdd346d2be5ee1e1713b40de');
INSERT INTO `sg_cti_call_record` (`ID`) VALUES ('00000b245fea4e9983c9fbf736d9d952');

And here is hive table:

CREATE TABLE test.in_csi_sogal_cti_db_sg_cti_call_record2 (
 id VARCHAR(40) )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS orcfile ;

And here is seatunnel config file:

env {
    job.mode = "BATCH"
}
source {
    jdbc {
        driver = "com.mysql.jdbc.Driver"
        url = "jdbc:mysql://xx:3306/sogal_cti_db?zeroDateTimeBehavior=convertToNull&useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&useSSL=false&autoReconnect=true"
        query = "select id   from   sogal_cti_db.SG_CTI_CALL_RECORD limit 10   "
        result_table_name = "source_table"
        user = "xx"
        password = "xx"
            }
}
sink {
    Hive {
        source_table_name = "source_table"
        table_name = "test.in_csi_sogal_cti_db_sg_cti_call_record2"
        save_mode = "overwrite"
        metastore_uri = "thrift://xx:9083"
    }
}
transform{} 
dik111 commented 2 years ago

HI @TyrantLucifer any idea about this bug?

TyrantLucifer commented 2 years ago

HI @TyrantLucifer any idea about this bug?

Sorry, I'm busy these days. I'll let you know tonight.

TyrantLucifer commented 2 years ago

HI @TyrantLucifer any idea about this bug?

Could you please add my wechat: tyrantlucifer ? So we can communicate more conveniently