Open numberlabs-developers opened 10 months ago
The user is asking about the priority of parameter settings in Apache Hudi when using Spark on Hive. The parameters in question are:
tblproperties
sql / data source set
hudi-defaults.conf
The user is using Hudi version 0.12.1, Spark version 3.1.3, Hive version 3.1.0, and Hadoop version 3.1.1. The storage is HDFS and it's not running on Docker.
The user also provided an hbase-site.xml file, which is a configuration file for HBase, a part of the Hadoop ecosystem. This file contains various properties and their values that are used to configure HBase.
Plan:
Understand the issue: The user wants to know the priority of the three parameter settings in Hudi, that is, which one takes effect when the same key is set to different values in more than one of these places.
Suggest actions: The user should check the Hudi documentation or source code to understand how these parameters are prioritized. If the documentation is not clear, the user could also try setting conflicting parameters and observe the behavior to determine the priority.
Relevant files: The hbase-site.xml file provided by the user is a configuration file for HBase, not Hudi. It's not clear how this file is related to the user's question about Hudi parameter settings. The user might need to look at Hudi configuration files or source code to understand the parameter priority. The hudi-defaults.conf file mentioned in the question is likely one of these relevant files.
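The experiment suggested in the plan (set the same key to conflicting values and observe which one wins) could be tracked with a small helper like the one below. This is only a sketch: `winning_layers`, the layer labels, and the marker values are hypothetical and not part of Hudi's API, and the observed value has to come from an actual run (for example from the writer logs or the Spark UI). It does not state what Hudi's real priority order is.

```python
# Hypothetical helper for the "set conflicting values and observe" experiment.
# Nothing here calls Hudi; it only records which layer's value took effect.

def winning_layers(values_by_layer, effective_value):
    """Return the layers whose configured value equals the effective value."""
    values = list(values_by_layer.values())
    # Each layer needs a distinct marker value, otherwise the winner is ambiguous.
    if len(set(values)) != len(values):
        raise ValueError("give each layer a distinct value")
    return [
        layer
        for layer, value in values_by_layer.items()
        if value == effective_value
    ]


# Assumption: the same key is given a different marker value in each place.
configured = {
    "tblproperties": "11",
    "sql / data source set": "22",
    "hudi-defaults.conf": "33",
}

# Placeholder only: replace with the value actually observed in your run.
observed = "22"
print(winning_layers(configured, observed))
```

To recover the full ordering rather than just the top layer, one option is to remove the winning layer's setting and repeat the run with the remaining two.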
Using Spark On Hive.
What is the priority for the following three parameter settings to take effect?
1. tblproperties
2. sql / data source set
3. hudi-defaults.conf
Environment Description
Hudi version : 0.12.1
Spark version : 3.1.3
Hive version : 3.1.0
Hadoop version : 3.1.1
Storage (HDFS/S3/GCS..) : HDFS
Running on Docker? (yes/no) : no