apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

Getting error while connecting to Hudi(CLI) 0.14.0 tables. #10249

Open jjjigar opened 9 months ago

jjjigar commented 9 months ago

Describe the problem you faced: We are getting an error after configuring the Hudi CLI 0.14.0.

We use the CLI command below to connect to Hudi tables.

Command line used: connect --path s3a://{our bucketname}/{our tablename}

java.lang.NoClassDefFoundError: org/apache/hadoop/fs/store/EtagChecksum
    at java.lang.Class.forName0(Native Method) ~[?:1.8.0_382]
    at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_382]
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2362) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2327) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2423) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3213) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3245) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3296) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3264) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356) ~[hadoop-common-2.10.1.jar:?]
    at org.apache.hudi.common.fs.FSUtils.getFs(FSUtils.java:116) ~[hudi-common-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.HoodieTableMetaClient.getFs(HoodieTableMetaClient.java:308) ~[hudi-common-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:139) ~[hudi-common-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:692) ~[hudi-common-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:85) ~[hudi-common-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:774) ~[hudi-common-0.14.0.jar:0.14.0]
    at org.apache.hudi.cli.HoodieCLI.refreshTableMetadata(HoodieCLI.java:89) ~[hudi-cli-0.14.0.jar:0.14.0]
    at org.apache.hudi.cli.HoodieCLI.connectTo(HoodieCLI.java:95) ~[hudi-cli-0.14.0.jar:0.14.0]
    at org.apache.hudi.cli.commands.TableCommand.connect(TableCommand.java:87) ~[hudi-cli-0.14.0.jar:0.14.0]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_382]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_382]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_382]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
    at org.springframework.shell.command.invocation.InvocableShellMethod.doInvoke(InvocableShellMethod.java:306) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.shell.command.invocation.InvocableShellMethod.invoke(InvocableShellMethod.java:232) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.shell.command.CommandExecution$DefaultCommandExecution.evaluate(CommandExecution.java:158) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.shell.Shell.evaluate(Shell.java:208) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.shell.Shell.run(Shell.java:140) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.shell.jline.InteractiveShellRunner.run(InteractiveShellRunner.java:73) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.shell.DefaultShellApplicationRunner.run(DefaultShellApplicationRunner.java:65) ~[spring-shell-core-2.1.1.jar:2.1.1]
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:762) ~[spring-boot-2.7.3.jar:2.7.3]
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752) ~[spring-boot-2.7.3.jar:2.7.3]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) ~[spring-boot-2.7.3.jar:2.7.3]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306) ~[spring-boot-2.7.3.jar:2.7.3]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295) ~[spring-boot-2.7.3.jar:2.7.3]
    at org.apache.hudi.cli.Main.main(Main.java:34) ~[hudi-cli-0.14.0.jar:0.14.0]
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.store.EtagChecksum
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387) ~[?:1.8.0_382]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_382]
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_382]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_382]
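One thing that may help when reading a trace like this is listing which jars the frames were actually loaded from. Here the Hadoop frames come from hadoop-common-2.10.1.jar, and as far as I know org.apache.hadoop.fs.store.EtagChecksum only ships in hadoop-common 3.x, so a version mismatch looks like a plausible cause (that diagnosis is an assumption, not something confirmed in this thread). A minimal sketch, where `jars_in_trace` is a hypothetical helper name and not part of Hudi:

```shell
#!/bin/sh
# Hypothetical helper (not part of Hudi): print each distinct jar named in the
# "~[name.jar:version]" suffixes of a Log4j-style stack trace read from stdin.
# Frames tagged "~[?:1.8.0_382]" come from the JDK and are skipped.
jars_in_trace() {
  grep -oE '~\[[^]:]+\.jar' | sed 's/^~\[//' | sort -u
}

# Usage (trace.txt is an assumed file holding the pasted stack trace):
#   jars_in_trace < trace.txt
```

For the trace above this should list hadoop-common-2.10.1.jar next to the Hudi and Spring jars, which makes it easy to compare against the client jars that were added by hand.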

Steps to reproduce the behavior: We configured the Hudi 0.14.0 CLI using the steps below:

  1. Install Java 8 (OpenJDK 1.8) on the EC2 instance and set the $JAVA_HOME environment variable
  2. sudo -H python3 -m pip install pyspark==3.4.1
  3. Set $SPARK_HOME environment variable on the EC2 instance
  4. export SPARK_HOME=/usr/local/lib/python3.8/dist-packages/pyspark
  5. Download the Hudi CLI 0.14.0 version listed here
  6. After downloading and unzipping, cd into "hudi-release-0.14.0" and run "mvn -T 2C clean package -DskipTests -Dspark3.4 -Dscala-2.12".
  7. Once step 6 has completed and the build has succeeded, go to the hudi-release-0.13.0/hudi-cli path and wget the client jars hadoop-aws-3.2.2.jar and aws-java-sdk-bundle-1.12.180.jar.
  8. export CLIENT_JAR=/home/ubuntu/hudi-release-0.13.0/hudi-cli/aws-java-sdk-bundle-1.12.180.jar:/home/ubuntu/hudi-release-0.13.0/hudi-cli/hadoop-aws-3.2.2.jar
    export HOODIE_ENV_fs_DOT_s3a_DOT_impl=org.apache.hadoop.fs.s3a.S3AFileSystem
    export HOODIE_ENV_fs_DOT_s3a_DOT_aws_DOT_credentials_DOT_provider=com.amazonaws.auth.InstanceProfileCredentialsProvider,com.amazonaws.auth.DefaultAWSCredentialsProviderChain
    export HOODIE_ENV_fs_DOT_AbstractFileSystem_DOT_s3a_DOT_impl=org.apache.hadoop.fs.s3a.S3A
  9. Copy the client jars /home/ubuntu/hudi-release-0.13.0/hudi-cli/aws-java-sdk-bundle-1.12.180.jar and /home/ubuntu/hudi-release-0.13.0/hudi-cli/hadoop-aws-3.2.2.jar to the $SPARK_HOME/jars path
  10. Inside the hudi-release-0.14.0/hudi-cli path, run ./hudi-cli.sh to start the Hudi CLI.
  11. The console shows the hudi→ prompt, where Hudi commands can be run.
  12. Finally, we should be able to load Hudi 0.14.0 tables using the "connect --path s3a:///" command.
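The environment setup in steps 7 to 9 can be sketched as a small pre-flight script. My understanding is that the Hudi CLI copies variables prefixed with HOODIE_ENV_ into the Hadoop configuration and turns each _DOT_ back into a dot, which is why step 8 spells the keys that way; treat that as an assumption to verify. The helper names `to_hoodie_env` and `jar_major_version` are hypothetical and not part of Hudi, and comparing major versions from file names is only a rough heuristic.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; neither is provided by Hudi.

# Map a Hadoop property key (fs.s3a.impl) to the HOODIE_ENV_ variable
# name used in step 8 (HOODIE_ENV_fs_DOT_s3a_DOT_impl).
to_hoodie_env() {
  printf 'HOODIE_ENV_%s\n' "$(printf '%s' "$1" | sed 's/\./_DOT_/g')"
}

# Extract the major version from a jar file name,
# e.g. hadoop-aws-3.2.2.jar -> 3.
jar_major_version() {
  jmv_name=$(basename "$1" .jar)    # hadoop-aws-3.2.2
  jmv_name=${jmv_name##*-}          # 3.2.2
  printf '%s\n' "${jmv_name%%.*}"   # 3
}

# Export one of the S3A settings from step 8 via the generated name.
export "$(to_hoodie_env fs.s3a.impl)=org.apache.hadoop.fs.s3a.S3AFileSystem"

# Compare the hadoop-aws jar from step 7 with the hadoop-common jar
# named in the stack trace; warn when their major versions differ.
aws_major=$(jar_major_version hadoop-aws-3.2.2.jar)
common_major=$(jar_major_version hadoop-common-2.10.1.jar)
if [ "$aws_major" != "$common_major" ]; then
  echo "warning: hadoop-aws ($aws_major.x) and hadoop-common ($common_major.x) differ" >&2
fi
```

With the jar names from this report the check prints the warning, since hadoop-aws is 3.x while the trace shows hadoop-common 2.x.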

Expected behavior: The command "connect --path s3a:///" should load tables without any error.

Environment Description


ad1happy2go commented 9 months ago

@jjjigar I also got this error while triaging some other issue mentioned here - https://github.com/apache/hudi/issues/9903#issuecomment-1811830545

Will try to fix the setup.

jjjigar commented 8 months ago

Do we have any update?

jjjigar commented 8 months ago

Any update please?

ad1happy2go commented 8 months ago

@jjjigar Sorry for the delay here. I will work on this this week.