apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Hudi CLI bundle not working #10566

Open CTTY opened 7 months ago

CTTY commented 7 months ago

Describe the problem you faced

When hudi-cli-with-bundle.sh is used to start the Hudi CLI, many commands fail immediately with the error below:

24/01/26 00:22:51 INFO InputStreamConsumer: Error: Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/common/engine/HoodieEngineContext
24/01/26 00:22:51 INFO InputStreamConsumer: 24/01/26 00:22:51 INFO ShutdownHookManager: Shutdown hook called
24/01/26 00:22:51 INFO InputStreamConsumer: 24/01/26 00:22:51 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-5465e2a8-7b9a-4cf1-b46a-afec9f50d860
Failed to clean hoodie dataset

Note that the error message is also very limited. The stack trace appears to have been truncated.
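
Since the only usable detail in the truncated output is the "Failed to load" line, a small script can pull the missing class name out of a captured CLI log. This is a minimal sketch, not part of Hudi: the regular expression and the function name `find_missing_classes` are assumptions based solely on the log line shown above.

```python
import re

# Pattern modeled on the single error line quoted in this issue (an assumption,
# not a documented log format): "Error: Failed to load <main class>: <missing class>".
FAILED_TO_LOAD = re.compile(
    r"Error: Failed to load (?P<main>[\w.$]+): (?P<missing>[\w/$]+)"
)


def find_missing_classes(log_text):
    """Return (main_class, missing_class) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = FAILED_TO_LOAD.search(line)
        if match:
            # The missing class is logged with slashes; convert to dotted form.
            missing = match.group("missing").replace("/", ".")
            pairs.append((match.group("main"), missing))
    return pairs


if __name__ == "__main__":
    sample = (
        "24/01/26 00:22:51 INFO InputStreamConsumer: Error: Failed to load "
        "org.apache.hudi.cli.commands.SparkMain: "
        "org/apache/hudi/common/engine/HoodieEngineContext"
    )
    for main_class, missing_class in find_missing_classes(sample):
        print(f"{main_class} could not be loaded; missing {missing_class}")
```

The message suggests that org.apache.hudi.common.engine.HoodieEngineContext was not visible on the classpath of the launched Spark process, although the truncated stack trace does not confirm this.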

To Reproduce

Steps to reproduce the behavior:

  1. Create a Hudi table
  2. Configure Spark/Hadoop/Hudi classpath for CLI and start CLI with hudi-cli-with-bundle.sh
  3. Connect to Hudi table with command connect --path <table_path>
  4. Run cleans run
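
For step 2, it may help to check which of the jars placed on the CLI classpath actually contain the class named in the error. The following is a hedged sketch rather than an official Hudi tool: the helper name `jars_containing` is hypothetical, and the jar paths passed on the command line are whatever your own setup uses.

```python
import sys
import zipfile

# Class entry taken from the error message in this issue.
MISSING_CLASS_ENTRY = "org/apache/hudi/common/engine/HoodieEngineContext.class"


def jars_containing(jar_paths, entry=MISSING_CLASS_ENTRY):
    """Return the subset of jar_paths whose archive contains the given entry."""
    found = []
    for path in jar_paths:
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    found.append(path)
        except (OSError, zipfile.BadZipFile):
            # Skip paths that are missing, unreadable, or not valid archives.
            continue
    return found


if __name__ == "__main__":
    # Usage: python check_jars.py <jar> [<jar> ...]
    for jar_path in jars_containing(sys.argv[1:]):
        print(jar_path)
```

If no jar is printed, the class is absent from every jar that was checked, which would be consistent with the load failure above. If a jar is printed, the class exists but may still not be reaching the Spark process that the CLI launches.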

Expected behavior

I believe there are 2 problems here:

Environment Description

Additional context

Stacktrace

See above

ad1happy2go commented 7 months ago

@CTTY I was trying to reproduce this issue but ran into a different setup problem. I will get back to you on this soon.

mansipp commented 5 months ago

@ad1happy2go I am getting a similar error when running commit rollback, compaction scheduleAndExecute, compaction schedule, and savepoint create.

commit rollback --commit 20240408231846380
24/04/08 23:22:02 INFO InputStreamConsumer: Apr 08, 2024 11:22:02 PM org.apache.spark.launcher.Log4jHotPatchOption staticJavaAgentOption
24/04/08 23:22:02 INFO InputStreamConsumer: WARNING: spark.log4jHotPatch.enabled is set to true, but /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar does not exist at the configured location
24/04/08 23:22:02 INFO InputStreamConsumer:
24/04/08 23:22:03 INFO InputStreamConsumer: Error: Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/common/engine/HoodieEngineContext
24/04/08 23:22:03 INFO InputStreamConsumer: 24/04/08 23:22:03 INFO ShutdownHookManager: Shutdown hook called
24/04/08 23:22:03 INFO InputStreamConsumer: 24/04/08 23:22:03 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-272bb6ef-f858-42a6-b9d0-9614f1f36371
24/04/08 23:22:03 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3://<path>/
24/04/08 23:22:03 INFO HoodieTableConfig: Loading table properties from s3://<path>mansipp_hudi_mor_table_2/.hoodie/hoodie.properties
24/04/08 23:22:03 INFO S3NativeFileSystem: Opening 's3://<path>/.hoodie/hoodie.properties' for reading
24/04/08 23:22:03 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from s3://<path>/mansipp_hudi_mor_table_2/
Commit 20240408231846380 failed to roll back
compaction schedule --hoodieConfigs "hoodie.compact.inline.max.delta.commits=1"
24/04/08 23:34:34 INFO InputStreamConsumer: Apr 08, 2024 11:34:34 PM org.apache.spark.launcher.Log4jHotPatchOption staticJavaAgentOption
24/04/08 23:34:34 INFO InputStreamConsumer: WARNING: spark.log4jHotPatch.enabled is set to true, but /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar does not exist at the configured location
24/04/08 23:34:34 INFO InputStreamConsumer:
24/04/08 23:34:36 INFO InputStreamConsumer: Error: Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/common/engine/HoodieEngineContext
24/04/08 23:34:36 INFO InputStreamConsumer: 24/04/08 23:34:36 INFO ShutdownHookManager: Shutdown hook called
24/04/08 23:34:36 INFO InputStreamConsumer: 24/04/08 23:34:36 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-e553d601-6f57-4d2f-9543-da0bee777c41
Failed to run compaction for 20240408233433912