exasol / cloud-storage-extension

Exasol Cloud Storage Extension for accessing formatted data (Avro, ORC, and Parquet) on public cloud storage systems
MIT License

No such method error when using HDFS #219

Closed morazow closed 1 year ago

morazow commented 1 year ago

Situation

When importing from or exporting to HDFS, we get the following error:

[2022-09-09 19:21:43] [22002] VM error: F-UDF-CL-LIB-1127: F-UDF-CL-SL-JAVA-1002: F-UDF-CL-SL-JAVA-1013:
[2022-09-09 19:21:43] com.exasol.ExaUDFException: F-UDF-CL-SL-JAVA-1080: Exception during run
[2022-09-09 19:21:43] java.lang.NoSuchMethodError: 'org.apache.htrace.core.Tracer org.apache.hadoop.fs.FsTracer.get(org.apache.hadoop.conf.Configuration)'
[2022-09-09 19:21:43] org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:323)
[2022-09-09 19:21:43] org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:308)
[2022-09-09 19:21:43] org.apache.hadoop.hdfs.DistributedFileSystem.initDFSClient(DistributedFileSystem.java:201)
[2022-09-09 19:21:43] org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:186)
[2022-09-09 19:21:43] org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
[2022-09-09 19:21:43] org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
[2022-09-09 19:21:43] org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
[2022-09-09 19:21:43] org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
[2022-09-09 19:21:43] org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
[2022-09-09 19:21:43] com.exasol.cloudetl.bucket.Bucket.fileSystem$lzycompute(Bucket.scala:70)
[2022-09-09 19:21:43] com.exasol.cloudetl.bucket.Bucket.fileSystem(Bucket.scala:69)
[2022-09-09 19:21:43] com.exasol.cloudetl.bucket.Bucket.getPaths(Bucket.scala:79)
[2022-09-09 19:21:43] com.exasol.cloudetl.emitter.FilesMetadataEmitter.<init>(FilesMetadataEmitter.scala:27)
[2022-09-09 19:21:43] com.exasol.cloudetl.scriptclasses.FilesMetadataReader$.run(FilesMetadataReader.scala:33)
[2022-09-09 19:21:43] com.exasol.cloudetl.scriptclasses.FilesMetadataReader.run(FilesMetadataReader.scala)
[2022-09-09 19:21:43] com.exasol.ExaWrapper.run(ExaWrapper.java:214)
[2022-09-09 19:21:43] (Session: 1743496698947633152)

Reason

The cause is that the HDFS client (hadoop-hdfs-client) version on the classpath is older than the expected one. This is visible when listing the dependency tree (for example, with mvn dependency:tree):

[INFO] +- org.apache.hadoop:hadoop-hdfs:jar:3.3.4:compile
[INFO] |  +- commons-daemon:commons-daemon:jar:1.0.13:compile
[INFO] |  \- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile
[INFO] +- org.alluxio:alluxio-core-client-hdfs:jar:2.8.1:compile
[INFO] |  +- org.apache.hadoop:hadoop-client:jar:3.3.1:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-hdfs-client:jar:3.3.1:compile
[INFO] |  |  |  \- com.squareup.okhttp:okhttp:jar:2.7.5:compile
[INFO] |  |  |     \- com.squareup.okio:okio:jar:1.6.0:compile

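The expected Hadoop version is 3.3.4, but the HDFS client version is 3.3.1, which is pulled in transitively by the Alluxio dependency. Judging by the stack trace, the 3.3.1 DFSClient calls an FsTracer.get method that returns org.apache.htrace.core.Tracer, and that signature is presumably no longer present in the 3.3.4 classes it is loaded with.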
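
A mismatch like this can be spotted automatically by scanning the dependency tree output. The following is a minimal Python sketch, not part of the project: the function name versions_by_group is hypothetical, the sample input is an excerpt of the tree quoted above, and the regular expression assumes plain jar coordinates without a classifier.

```python
import re
from collections import defaultdict

# Excerpt of the dependency tree quoted in this issue.
TREE = r"""
[INFO] +- org.apache.hadoop:hadoop-hdfs:jar:3.3.4:compile
[INFO] |  +- commons-daemon:commons-daemon:jar:1.0.13:compile
[INFO] |  \- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile
[INFO] +- org.alluxio:alluxio-core-client-hdfs:jar:2.8.1:compile
[INFO] |  +- org.apache.hadoop:hadoop-client:jar:3.3.1:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-hdfs-client:jar:3.3.1:compile
"""

# Matches groupId:artifactId:jar:version:scope.
# Assumption: coordinates with a classifier are not handled.
COORD = re.compile(r"([\w.\-]+):([\w.\-]+):jar:([\w.\-]+):(\w+)")


def versions_by_group(tree_text, group_id):
    """Map each version of `group_id` found in the tree to its artifact IDs."""
    found = defaultdict(list)
    for line in tree_text.splitlines():
        match = COORD.search(line)
        if match and match.group(1) == group_id:
            found[match.group(3)].append(match.group(2))
    return dict(found)


hadoop = versions_by_group(TREE, "org.apache.hadoop")
for version, artifacts in sorted(hadoop.items()):
    print(version, ", ".join(artifacts))
if len(hadoop) > 1:
    print("Mixed Hadoop versions found in the dependency tree")
```

For the excerpt above, the script reports hadoop-client and hadoop-hdfs-client under 3.3.1 and hadoop-hdfs under 3.3.4, which is the mismatch described in this issue.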

Acceptance Criteria