numberlabs-developers / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] What is the priority for Hudi parameter settings to take effect? #197

Open numberlabs-developers opened 10 months ago

numberlabs-developers commented 10 months ago

Using Spark On Hive.

What is the priority in which the following three parameter settings take effect?

  1. tblproperties
  2. sql / data source set
  3. hudi-defaults.conf

Environment Description

Hudi version : 0.12.1

Spark version : 3.1.3

Hive version : 3.1.0

Hadoop version : 3.1.1

Storage (HDFS/S3/GCS..) : HDFS

Running on Docker? (yes/no) : no

torvalds-dev[bot] commented 10 months ago

torvalds.dev is analyzing the ticket

torvalds-dev[bot] commented 10 months ago

The user is asking about the priority of parameter settings in Apache Hudi when using Spark on Hive. The parameters in question are:

  1. tblproperties
  2. sql / data source set
  3. hudi-defaults.conf

The user's environment includes Hudi version 0.12.1, Spark version 3.1.3, Hive version 3.1.0, Hadoop version 3.1.1, and storage on HDFS. The user is not running on Docker.

The user also provided the content of a file named hbase-site.xml located in hudi-common/src/main/resources/. This file contains configuration settings for HBase, which, like Hive, is part of the Hadoop ecosystem.

Plan:

  1. Understand the issue: The user wants to know the priority of the three parameter settings in Hudi. This is likely related to how Hudi applies these settings when it runs. The priority could affect the final configuration of a Hudi job, especially if there are conflicting settings in these parameters.

  2. Suggest actions: The user should check the Hudi documentation or source code to understand how it applies these settings. If the documentation is not clear, the user could also try setting conflicting parameters and see which one takes effect.

  3. Identify relevant files: The hbase-site.xml file provided by the user may be relevant, but only if the job actually uses HBase (for example, through Hudi's HBase index). The files more directly tied to the question are the hudi-defaults.conf file mentioned by the user and any files where tblproperties and sql / data source set parameters are defined.
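
The experiment suggested in step 2 can be reasoned about with a small sketch. The Python below is not Hudi code: `resolve` is a hypothetical helper, the values are made up, and the layer order in the example is only an assumption to be checked against a real job, not Hudi's confirmed precedence. It merges configuration layers from lowest to highest precedence and records which layer supplied each effective value, which is the same bookkeeping one would do by hand when setting one key to a different value in each of the three places.

```python
from typing import Dict, List, Tuple


def resolve(layers: List[Tuple[str, Dict[str, str]]]) -> Dict[str, Tuple[str, str]]:
    """Merge config layers listed from lowest to highest precedence.

    Returns {key: (effective_value, name_of_layer_that_supplied_it)}.
    """
    effective: Dict[str, Tuple[str, str]] = {}
    for name, settings in layers:
        for key, value in settings.items():
            # A later (higher-precedence) layer overwrites an earlier one.
            effective[key] = (value, name)
    return effective


if __name__ == "__main__":
    # ASSUMED order, for illustration only; verify against an actual Hudi job.
    layers = [
        ("hudi-defaults.conf", {"hoodie.upsert.shuffle.parallelism": "100"}),
        ("tblproperties", {"hoodie.upsert.shuffle.parallelism": "200"}),
        ("sql / data source set", {"hoodie.upsert.shuffle.parallelism": "300"}),
    ]
    for key, (value, source) in resolve(layers).items():
        print(f"{key} = {value} (from {source})")
```

Giving the same key a distinct value in each layer makes the winning source unambiguous: whichever value the running job reports identifies the layer that took effect.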