JoshuaZhuCN opened this issue 1 year ago
Yeah, HoodieCatalogTable#initHoodieTable does not copy the option hoodie.table.keygenerator.class the way the SQL writer does in HoodieSparkSqlWriter#mergeParamsAndGetHoodieConfig, but I guess that is by design: the write config can override the table config on the writer path when initializing the table, but not for catalog table creation/modification.
@nsivabalan Can you help double-check this?
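To make the difference concrete, here is a purely hypothetical Scala sketch of the two behaviors described above. None of these helper names exist in Hudi; the sketch only mirrors the effect, not the real implementation.

```scala
// Hypothetical sketch only -- these helpers do not exist in Hudi. It just
// illustrates the difference described above: the writer path overlays the
// write options onto the table config, while the catalog-table init path
// never forwards hoodie.table.keygenerator.class.
object KeyGenOptionForwardingSketch {

  val KeyGenProp = "hoodie.table.keygenerator.class"

  // Writer-path style merge: write options win over the existing table config,
  // so a key generator passed to the writer takes effect.
  def mergeWriterParams(writeOptions: Map[String, String],
                        tableConfig: Map[String, String]): Map[String, String] =
    tableConfig ++ writeOptions

  // Catalog-path style init: the key generator option from the DDL is not
  // carried over into the table config that ends up in hoodie.properties.
  def initCatalogTableConfig(ddlOptions: Map[String, String]): Map[String, String] =
    ddlOptions - KeyGenProp
}
```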
If we set a value different from the one in hoodie.properties when writing data, an error is reported.
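For context, a minimal sketch of the kind of write that hits this, assuming a table whose hoodie.properties already records SimpleKeyGenerator is written to with ComplexKeyGenerator requested on the write path. The table name, path, and columns are made up, and the exact error depends on the Hudi version.

```scala
// spark-shell style sketch; `spark` is the active SparkSession.
// Assumes a Hudi table already exists at this (made-up) path with
// SimpleKeyGenerator recorded in .hoodie/hoodie.properties.
import org.apache.spark.sql.SaveMode

val df = spark.sql("select 1 as id, 'a' as name, '2023-01-01' as dt")

df.write.format("hudi").
  option("hoodie.table.name", "demo_tbl").
  option("hoodie.datasource.write.recordkey.field", "id,name").
  option("hoodie.datasource.write.partitionpath.field", "dt").
  // Requesting a key generator that differs from the one in hoodie.properties
  // is what triggers the reported error.
  option("hoodie.datasource.write.keygenerator.class",
         "org.apache.hudi.keygen.ComplexKeyGenerator").
  mode(SaveMode.Append).
  save("/tmp/hudi/demo_tbl")
```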
I get it. Then there is no need to set this option again in datasource write mode if the DDL has already set it.
@jonvex can you look into this, please?
https://issues.apache.org/jira/browse/HUDI-5262 I reported this a few weeks ago. You need to use hoodie.table.keygenerator.class to set the key generator when creating a table in spark-sql.
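As a concrete illustration of that, here is a sketch of a spark-sql CREATE TABLE that sets the key generator through hoodie.table.keygenerator.class. The table name, columns, and path are placeholders; adjust them to your schema.

```scala
// spark-shell style sketch; `spark` is the active SparkSession.
// Table name, columns, and location are made up for illustration.
spark.sql("""
  create table keygen_tbl (
    id int,
    name string,
    price double,
    dt string
  ) using hudi
  partitioned by (dt)
  tblproperties (
    type = 'cow',
    primaryKey = 'id,name',
    'hoodie.table.keygenerator.class' = 'org.apache.hudi.keygen.ComplexKeyGenerator'
  )
  location '/tmp/hudi/keygen_tbl'
""")
```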
@xushiyan https://github.com/apache/hudi/pull/7394 here is an example PR. Instead of failing, another option would be to set the correct config.
@jonvex: can you fix our quick start guide around this, please? Do create a JIRA as well.
The PRs are ready for review, and then we can close this out.
The keygenerator.class value set when creating a table with Spark SQL does not end up taking effect in hoodie.properties.
e.g.: we want to set the value to 'ComplexKeyGenerator', but it ends up as 'SimpleKeyGenerator' in hoodie.properties.
To Reproduce
Steps to reproduce the behavior:
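The original steps were not included; below is a hypothetical sketch of the kind of reproduction described, assuming the key generator was requested via hoodie.datasource.write.keygenerator.class in the DDL. The table name, columns, and path are made up.

```scala
// spark-shell style sketch; `spark` is the active SparkSession.
// 1. Create the table in spark-sql asking for ComplexKeyGenerator.
spark.sql("""
  create table keygen_repro (
    id int,
    name string,
    dt string
  ) using hudi
  partitioned by (dt)
  tblproperties (
    type = 'cow',
    primaryKey = 'id,name',
    'hoodie.datasource.write.keygenerator.class' = 'org.apache.hudi.keygen.ComplexKeyGenerator'
  )
  location '/tmp/hudi/keygen_repro'
""")

// 2. Inspect the table config, e.g.
//      cat /tmp/hudi/keygen_repro/.hoodie/hoodie.properties
//    Per the report, hoodie.table.keygenerator.class shows
//    org.apache.hudi.keygen.SimpleKeyGenerator rather than the requested
//    ComplexKeyGenerator.
```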
Environment Description
Hudi version : 0.12.1
Spark version : 3.1.3
Hive version : 3.1.1
Hadoop version : 3.1.0
Storage (HDFS/S3/GCS..) : HDFS
Running on Docker? (yes/no) : no