apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [hdfs to s3] java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found #4406

Open zhangzhaohuazai opened 1 year ago

zhangzhaohuazai commented 1 year ago

What happened

When I try to sink files from HDFS to S3 using the SeaTunnel engine, I get this error: Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

SeaTunnel Version

2.3.0

SeaTunnel Config

env {
  execution.parallelism = 1
  // job.mode = "BATCH"
}

source {
  HdfsFile {
    result_table_name = "mysql_users2"
    // schema {
    //   fields {
    //     name = string
    //     age = int
    //   }
    // }
    path = "/test/input"
    type = "text"
    fs.defaultFS = "hdfs://***:9000"
  }
}

transform {

}

sink{
  S3File {
    source_table_name="mysql_users2"
    access_key = "***"
    secret_key = "***"
    bucket = "s3a://***:9000"
    tmp_path = "/tmp/seatunnel"
    path="/seatunnel/text"
    row_delimiter="\n"
    partition_dir_expression="${k0}=${v0}"
    is_partition_field_write_in_file=true
    file_name_expression="${transactionId}_${now}"
    file_format="text"
    filename_time_format="yyyy.MM.dd"
    is_enable_transaction=true
    hadoop_s3_properties {
       "fs.s3a.aws.credentials.provider" = "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    }
  }
}

Running Command

./bin/seatunnel.sh --config ./config/hdfs_s3.conf.template -e local

Error Exception

2023-03-24 09:47:31,750 INFO  org.apache.seatunnel.engine.server.TaskExecutionService - [localhost]:5801 [seatunnel_default_cluster-145368] [5.1] Task TaskGroupLocation{jobId=691463097877004289, pipelineId=1, taskGroupId=50000} complete with state FAILED
2023-03-24 09:47:31,750 INFO  org.apache.seatunnel.engine.server.dag.physical.PhysicalVertex - Job SeaTunnel (691463097877004289), Pipeline: [(1/1)], task: [HdfsFile-SourceTask (1/1)] turn to end state FAILED.
2023-03-24 09:47:31,750 ERROR org.apache.seatunnel.engine.server.dag.physical.PhysicalVertex - Job SeaTunnel (691463097877004289), Pipeline: [(1/1)], task: [HdfsFile-SourceTask (1/1)] end with state FAILED and Exception: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
    at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:166)
    at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:57)
    at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
    at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
    at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.handleRecord(IntermediateQueueFlowLifeCycle.java:77)
    at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:58)
    at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:65)
    at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:157)
    at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:71)
    at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:357)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2638)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3269)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3301)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:227)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:463)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getFileSystem(FileSystemUtils.java:63)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getOutputStream(FileSystemUtils.java:69)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.getOrCreateOutputStream(TextWriteStrategy.java:114)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.write(TextWriteStrategy.java:75)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:108)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43)
    at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:161)
    ... 14 more

Flink or Spark Version

No response

Java or Scala Version

No response

zhangzhaohuazai commented 1 year ago

These are my connectors in /seatunnel/connectors/seatunnel: image

zhangzhaohuazai commented 1 year ago

By the way, I use MinIO for S3, but I didn't find an "endpoint" parameter in the S3File sink section of the official documentation: image https://seatunnel.apache.org/docs/2.3.0/connector-v2/sink/S3File

zhangzhaohuazai commented 1 year ago

When I add these jars to plugins, I get a new error:

2023-03-24 11:36:34,319 INFO org.apache.seatunnel.engine.server.dag.physical.PhysicalVertex - Job SeaTunnel (691490536757919745), Pipeline: [(1/1)], task: [HdfsFile-SourceTask (1/1)] turn to end state FAILED.
2023-03-24 11:36:34,319 ERROR org.apache.seatunnel.engine.server.dag.physical.PhysicalVertex - Job SeaTunnel (691490536757919745), Pipeline: [(1/1)], task: [HdfsFile-SourceTask (1/1)] end with state FAILED and Exception: java.lang.NoSuchMethodError: org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(Ljava/util/concurrent/ExecutorService;IZ)V
    at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1057)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getOutputStream(FileSystemUtils.java:71)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.getOrCreateOutputStream(TextWriteStrategy.java:114)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.write(TextWriteStrategy.java:75)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:108)
    at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43)
    at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:161)
    at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:57)
    at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
    at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
    at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.handleRecord(IntermediateQueueFlowLifeCycle.java:77)
    at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:58)
    at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:65)
    at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:157)
    at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:71)
    at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:357)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)

image

EricJoy2048 commented 1 year ago

Hi, please ensure that the $SEATUNNEL_HOME/lib/ directory contains these jars:

image

You can build the project from source and then find them in seatunnel-dist/target/xxxx-bin.tar.gz. You can also download them separately.
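One way to check the advice above is to look inside the jars directly. The following is a minimal sketch and not part of SeaTunnel: the function name `jars_providing` is hypothetical, and it assumes the jars sit directly under `$SEATUNNEL_HOME/lib`. It lists which jars contain a given class.

```python
import os
import zipfile
from pathlib import Path


def jars_providing(lib_dir, class_name):
    """Return the names of the jars directly under lib_dir that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives (e.g. partial downloads).
            continue
    return hits


if __name__ == "__main__":
    # Assumption: SEATUNNEL_HOME is set; otherwise ./lib is inspected.
    lib_dir = os.path.join(os.environ.get("SEATUNNEL_HOME", "."), "lib")
    for name in (
        "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "org.apache.hadoop.util.SemaphoredDelegatingExecutor",
    ):
        print(name, "->", jars_providing(lib_dir, name) or "not found")
```

An empty result for `S3AFileSystem` corresponds to the ClassNotFoundException reported in this issue. More than one hit for the same class is a hint that two copies may be competing on the classpath.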

EricJoy2048 commented 1 year ago

Regarding the new error reported above after adding the jars to plugins (java.lang.NoSuchMethodError: org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(Ljava/util/concurrent/ExecutorService;IZ)V):

This is a jar conflict. You need to use the Hadoop jar shaded by SeaTunnel, named seatunnel-hadoop3-3.1.4-uber.jar.
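The NoSuchMethodError message names the missing method in JVM descriptor notation, where `<init>` denotes a constructor. As a reading aid, here is a minimal sketch (the helper `decode_descriptor` is hypothetical, written only for this illustration) that turns such a descriptor into readable Java types.

```python
import re

# JVM single-letter codes for primitive types and void.
_PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float", "I": "int",
    "J": "long", "S": "short", "Z": "boolean", "V": "void",
}

# One field type: optional array brackets, then an object type or a primitive.
_TYPE = re.compile(r"(\[*)(L[^;]+;|[BCDFIJSZV])")


def _decode_type(dims, code):
    if code.startswith("L"):
        base = code[1:-1].replace("/", ".")  # strip leading 'L' and trailing ';'
    else:
        base = _PRIMITIVES[code]
    return base + "[]" * len(dims)


def decode_descriptor(descriptor):
    """Turn '(Ljava/lang/String;I)V' into (['java.lang.String', 'int'], 'void')."""
    params_part, return_part = descriptor[1:].split(")", 1)
    params = [_decode_type(d, c) for d, c in _TYPE.findall(params_part)]
    (dims, code), = _TYPE.findall(return_part)
    return params, _decode_type(dims, code)


params, returns = decode_descriptor("(Ljava/util/concurrent/ExecutorService;IZ)V")
print(params, "->", returns)
# prints: ['java.util.concurrent.ExecutorService', 'int', 'boolean'] -> void
```

Read this way, the error says that the `SemaphoredDelegatingExecutor` class that was loaded has no constructor taking `(ExecutorService, int, boolean)`. That typically happens when the calling class and the called class come from jars built against different versions, which is consistent with the jar conflict described above.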

EricJoy2048 commented 1 year ago

I suggest you use SeaTunnel 2.3.1. It has more improvements to the S3File sink, and the documentation is clearer: https://seatunnel.incubator.apache.org/docs/connector-v2/sink/S3File

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

YuriyGavrilov commented 9 months ago

Hi, I just got this error in version 2.3.3 with all the suggested libs:

2023-11-30 00:27:16,141 INFO  org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand - Closed HazelcastInstance ......
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/statistics/IOStatisticsSource
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1027)
    at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
    at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
    at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:580)
    at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClassWithoutExceptionHandling(SeaTunnelBaseClassLoader.java:56)
    at org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader.loadClassWithoutExceptionHandling(SeaTunnelChildFirstClassLoader.java:86)
    at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClass(SeaTunnelBaseClassLoader.java:47)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:526)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:534)
    at java.base/java.lang.Class.forName(Class.java:513)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2575)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2540)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2636)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3269)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3301)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:227)
    at org.apache.seatunnel.connectors.seatunnel.file.source.reader.AbstractReadStrategy.getFileNamesByPath(AbstractReadStrategy.java:127)
    at org.apache.seatunnel.connectors.seatunnel.file.s3.source.S3FileSource.prepare(S3FileSource.java:72)
    at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSource(JobConfigParser.java:85)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:317)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:179)
    at org.apache.seatunnel.engine.core.job.AbstractJobEnvironment.getLogicalDag(AbstractJobEnvironment.java:109)
    at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:73)
    at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:143)
    at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
    at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.statistics.IOStatisticsSource
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:526)
    ... 36 more

YuriyGavrilov commented 9 months ago

It works only with this version:

Screenshot 2023-11-30 at 02 01 56

Thanks, @EricJoy2048

YuriyGavrilov commented 9 months ago

I understood how to load data from S3 to a local file, but I get an error when trying to copy from S3 to S3, using the same config, into another folder with the column-selection option sink_columns = ["a","b"]. These are the errors:

2023-12-12 21:20:20 2023-12-12 18:20:20,797 WARN  org.apache.seatunnel.engine.server.TaskExecutionService - [localhost]:5801 [seatunnel-86182] [5.1] Exception in org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask@6668d0db
2023-12-12 21:20:20 java.lang.NoSuchMethodError: org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(Ljava/util/concurrent/ExecutorService;IZ)V
2023-12-12 21:20:20     at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824) ~[hadoop-aws-3.2.4.jar:?]
2023-12-12 21:20:20     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118) ~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
2023-12-12 21:20:20     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098) ~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
2023-12-12 21:20:20     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1057) ~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getOutputStream(FileSystemUtils.java:108) ~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.getOrCreateOutputStream(TextWriteStrategy.java:138) ~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.write(TextWriteStrategy.java:81) ~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:126) ~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43) ~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:227) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:61) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:76) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:51) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:52) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78) ~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:613) [seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_342]
2023-12-12 21:20:20     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_342]
2023-12-12 21:20:20     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_342]
2023-12-12 21:20:20     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_342]
2023-12-12 21:20:20     at java.lang.Thread.run(Thread.java:750) [?:1.8.0_342]

Next error

2023-12-12 21:20:20 2023-12-12 18:20:20,809 ERROR org.apache.seatunnel.engine.server.dag.physical.PhysicalVertex - Job SeaTunnel_Job (787020937437380609), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-S3File-default-identifier]-SourceTask (1/1)] end with state FAILED and Exception: java.lang.NoSuchMethodError: org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(Ljava/util/concurrent/ExecutorService;IZ)V
2023-12-12 21:20:20     at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824)
2023-12-12 21:20:20     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
2023-12-12 21:20:20     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
2023-12-12 21:20:20     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1057)
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getOutputStream(FileSystemUtils.java:108)
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.getOrCreateOutputStream(TextWriteStrategy.java:138)
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.write(TextWriteStrategy.java:81)
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:126)
2023-12-12 21:20:20     at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:227)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:61)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:76)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:51)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:52)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
2023-12-12 21:20:20     at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:613)
2023-12-12 21:20:20     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
2023-12-12 21:20:20     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
2023-12-12 21:20:20     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2023-12-12 21:20:20     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2023-12-12 21:20:20     at java.lang.Thread.run(Thread.java:750)
2023-12-12 21:20:20 
2023-12-12 21:20:20 2023-12-12 18:20:20,809 ERROR org.apache.seatunnel.engine.server.dag.physical.SubPlan - Task TaskGroupLocation{jobId=787020937437380609, pipelineId=1, taskGroupId=50000} Failed in Job SeaTunnel_Job (787020937437380609), Pipeline: [(1/1)], Begin to cancel other tasks in this pipeline.

For this config:

2023-12-12 21:20:13     "env" : {
2023-12-12 21:20:13         "execution.parallelism" : 1,
2023-12-12 21:20:13         "job.mode" : "BATCH"
2023-12-12 21:20:13     },
2023-12-12 21:20:13     "source" : [
2023-12-12 21:20:13         {
2023-12-12 21:20:13             "bucket" : "s3a://test",
2023-12-12 21:20:13             "path" : "/seatunnel/",
2023-12-12 21:20:13             "secret_key" : "XXXXXX",
2023-12-12 21:20:13             "file_format_type" : "parquet",
2023-12-12 21:20:13             "access_key" : "XXXXXXX",
2023-12-12 21:20:13             "fs.s3a.aws.credentials.provider" : "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
2023-12-12 21:20:13             "plugin_name" : "S3File",
2023-12-12 21:20:13             "fs.s3a.endpoint" : "gateway.storjshare.io"
2023-12-12 21:20:13         }
2023-12-12 21:20:13     ],
2023-12-12 21:20:13     "transform" : [],
2023-12-12 21:20:13     "sink" : [
2023-12-12 21:20:13         {
2023-12-12 21:20:13             "bucket" : "s3a://test",
2023-12-12 21:20:13             "path" : "/seatunnel2/",
2023-12-12 21:20:13             "secret_key" : "XXXXX",
2023-12-12 21:20:13             "access_key" : "XXXXX",
2023-12-12 21:20:13             "fs.s3a.aws.credentials.provider" : "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
2023-12-12 21:20:13             "plugin_name" : "S3File",
2023-12-12 21:20:13             "fs.s3a.endpoint" : "gateway.storjshare.io",
2023-12-12 21:20:13             "sink_columns" : [
2023-12-12 21:20:13                 "a",
2023-12-12 21:20:13                 "b"
2023-12-12 21:20:13             ]
2023-12-12 21:20:13         }
2023-12-12 21:20:13     ]
2023-12-12 21:20:13 }
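
The first trace in the comment above annotates each frame with the jar it was loaded from, so the trace itself can show which jars supply the Hadoop classes. A minimal sketch, assuming frames in the `~[name.jar:version]` style seen in that log; the helper `jars_by_package` is hypothetical:

```python
import re

# Matches frames such as:
#   at org.example.Foo.bar(Foo.java:10) ~[some-lib-1.0.jar:1.0]
_FRAME = re.compile(r"at\s+([\w.$<>]+)\(.*?\)\s+~?\[([^\]:]+\.jar):")


def jars_by_package(trace, prefix="org.apache.hadoop."):
    """Map each jar named in the trace to the classes under prefix loaded from it."""
    found = {}
    for line in trace.splitlines():
        match = _FRAME.search(line)
        if not match:
            continue  # frames without a jar annotation (e.g. JDK classes)
        method, jar = match.groups()
        if method.startswith(prefix):
            class_name = method.rsplit(".", 1)[0]  # drop the method name
            found.setdefault(jar, set()).add(class_name)
    return found


# Two frames copied from the trace in this thread.
sample = """\
    at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824) ~[hadoop-aws-3.2.4.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118) ~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
"""
for jar, classes in jars_by_package(sample).items():
    print(jar, sorted(classes))
```

For the two sample frames, the result has one entry for hadoop-aws-3.2.4.jar and one for the seatunnel-hadoop3-3.1.4 uber jar. In other words, `S3AFileSystem` and the core `FileSystem` class are coming from jars whose names carry different Hadoop versions, which matches the jar conflict diagnosed earlier in the thread.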