numberlabs-developers / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] In what order of priority do Hudi parameter settings take effect? #194

Open numberlabs-developers opened 10 months ago

numberlabs-developers commented 10 months ago

I am using Spark on Hive.

In what order of priority do the following three parameter settings take effect?

  1. tblproperties
  2. sql / data source set
  3. hudi-defaults.conf

Environment Description

Hudi version : 0.12.1

Spark version : 3.1.3

Hive version : 3.1.0

Hadoop version : 3.1.1

Storage (HDFS/S3/GCS..) : HDFS

Running on Docker? (yes/no) : no

torvalds-dev[bot] commented 10 months ago

torvalds.dev is analyzing the ticket

torvalds-dev[bot] commented 10 months ago

The user is asking about the priority of parameter settings in Apache Hudi when using Spark on Hive. The parameters in question are:

  1. tblproperties
  2. sql / data source set
  3. hudi-defaults.conf

The user is using Hudi version 0.12.1, Spark version 3.1.3, Hive version 3.1.0, and Hadoop version 3.1.1. The storage is HDFS and it's not running on Docker.

The user also provided a hbase-site.xml file which is a configuration file for HBase, a part of the Hadoop ecosystem. This file contains various properties and their values that are used to configure HBase.

Plan:

  1. Understand the issue: The user wants to know which of the three parameter sources takes precedence in Hudi. This matters when the same setting is given conflicting values in more than one of these sources.

  2. Suggest actions: The user should check the Hudi documentation or source code to see how these sources are prioritized. If the documentation is not clear, the user could also set the same key to a different value in each of the three places and observe which value takes effect to determine the priority.

  3. Relevant files: The hbase-site.xml file provided by the user is a configuration file for HBase, not Hudi. It's not clear how this file is related to the user's question about Hudi parameter settings. The user might need to look at Hudi configuration files or source code to understand the parameter priority. The hudi-defaults.conf file mentioned in the question is likely one of these relevant files.
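The experiment suggested in step 2 can be organized as an elimination procedure: give one key (for example `hoodie.upsert.shuffle.parallelism`) a distinct sentinel value in each source, note which value takes effect, remove the winning source, and repeat. The following is only a minimal sketch of that bookkeeping in Python. The names `rank_sources` and `simulated_observe` are hypothetical, and the simulated observer uses an arbitrary alphabetical order that says nothing about Hudi's actual behavior. In a real run, `observe` would apply the remaining settings, run a write, and read back the effective value.

```python
from typing import Callable, Dict, List

# Distinct sentinel values for the same key, one per configuration source.
SENTINELS: Dict[str, str] = {
    "tblproperties": "101",
    "sql_set": "102",
    "hudi-defaults.conf": "103",
}


def rank_sources(
    sentinels: Dict[str, str],
    observe: Callable[[Dict[str, str]], str],
) -> List[str]:
    """Rank sources from highest to lowest priority by elimination."""
    active = dict(sentinels)
    ranking: List[str] = []
    while active:
        effective = observe(active)
        # The source whose sentinel matches the effective value won this round.
        winners = [name for name, value in active.items() if value == effective]
        if len(winners) != 1:
            raise ValueError(f"cannot attribute effective value {effective!r}")
        ranking.append(winners[0])
        del active[winners[0]]
    return ranking


def simulated_observe(active: Dict[str, str]) -> str:
    """Stand-in for a real run. ASSUMPTION: the winner is simply the
    alphabetically first source name, chosen only to make the sketch
    runnable; it is NOT Hudi's precedence rule."""
    return active[sorted(active)[0]]


if __name__ == "__main__":
    print(rank_sources(SENTINELS, simulated_observe))
```

Replacing `simulated_observe` with a function that performs the real observation would yield the ranking for the specific Hudi and Spark versions in use, which may differ between releases.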