Open numberlabs-developers opened 10 months ago
torvalds.dev is analyzing the ticket
The issue at hand is understanding the priority of parameter settings in Apache Hudi when using Spark on Hive. The parameters in question are:
tblproperties
sql / data source set
hudi-defaults.conf
The environment in which this issue is occurring is Hudi 0.12.1 with Spark 3.1.3, Hive 3.1.0 and Hadoop 3.1.1, storing data on HDFS and not running on Docker.
The file provided, hudi-common/src/main/resources/hbase-site.xml, is an XML configuration file for HBase, a part of the Hadoop ecosystem that provides random, real-time read/write access to large datasets.
Here is a plan to understand the issue and suggest relevant actions:
Understand the parameters: The first step is to understand what each of these parameters does. tblproperties are table-level settings in Hive, sql / data source set likely refers to settings in the SQL query or data source API when writing data into Hudi, and hudi-defaults.conf is likely a configuration file for Hudi.
Check the Hudi documentation: The Hudi documentation may provide information on the priority of these settings. If not, it may be necessary to check the source code or ask the Hudi community.
Experiment: If the documentation and source code do not provide a clear answer, it may be necessary to experiment by setting conflicting settings and seeing which one takes effect.
Check the hbase-site.xml file: This file may contain settings that affect the behavior of Hudi. Check this file for any settings that may be relevant to the issue.
Ask the Hudi community: If all else fails, it may be necessary to ask the Hudi community for help. They may have encountered this issue before and can provide guidance.
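The experiment step in the plan above could be set up roughly as follows. This is a minimal sketch, not a statement of how Hudi resolves settings: it gives one key a different marker value in each of the three layers and then maps whatever value is observed back to the layer it came from. The probe key, the marker values, the table schema, and the helper names are all assumptions chosen for illustration. The `key=value` line format for hudi-defaults.conf is also an assumption that should be checked against the Hudi 0.12.1 documentation.

```python
# Sketch of a precedence experiment: give one key a different marker value in
# each configuration layer, then see which marker Hudi actually uses.

# Assumption: any write-side key whose effect can be observed will do.
PROBE_KEY = "hoodie.upsert.shuffle.parallelism"

# Hypothetical marker values, one per layer named in the question.
MARKERS = {
    "tblproperties": "11",
    "sql / data source set": "22",
    "hudi-defaults.conf": "33",
}


def create_table_sql(table: str) -> str:
    """Build DDL that plants the tblproperties marker (schema is illustrative)."""
    return (
        f"create table {table} (id int, name string, ts bigint) using hudi "
        f"tblproperties (primaryKey = 'id', preCombineField = 'ts', "
        f"'{PROBE_KEY}' = '{MARKERS['tblproperties']}')"
    )


def session_set_sql() -> str:
    """Build the session-level SET statement carrying the SQL marker."""
    return f"set {PROBE_KEY}={MARKERS['sql / data source set']}"


def hudi_defaults_line() -> str:
    """Build the line to place in hudi-defaults.conf (format is an assumption)."""
    return f"{PROBE_KEY}={MARKERS['hudi-defaults.conf']}"


def winning_layer(observed_value):
    """Return the layer whose marker matches the observed value, else None."""
    matches = [
        layer for layer, marker in MARKERS.items()
        if marker == str(observed_value)
    ]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    # Print the three artifacts to apply by hand before running a write.
    print(create_table_sql("precedence_probe"))
    print(session_set_sql())
    print(hudi_defaults_line())
```

After applying all three and running a write, pass the value that actually took effect to `winning_layer`. Removing the winning layer's setting and repeating the run would then show which of the remaining two comes next. How the effective value is observed depends on the key chosen, so that part is left to the reader.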
Relevant files to the issue:
hudi-defaults.conf: This is the Hudi configuration file and may contain settings that are relevant to the issue.
hudi-common/src/main/resources/hbase-site.xml: This is an HBase configuration file and may contain settings that affect the behavior of Hudi.
Using Spark on Hive.
In what order of priority do the following three parameter settings take effect?
1. tblproperties
2. sql / data source set
3. hudi-defaults.conf
Environment Description
Hudi version : 0.12.1
Spark version : 3.1.3
Hive version : 3.1.0
Hadoop version : 3.1.1
Storage (HDFS/S3/GCS..) : HDFS
Running on Docker? (yes/no) : no