Open mzheng-plaid opened 1 week ago
Nice findings, @ad1happy2go. Is this a known issue for upgrades?
@mzheng-plaid Normally we should not upgrade directly from such a low version. If everything works when going to 0.12 first and then to 0.14, we should be good.
@ad1happy2go what do you mean? Do I need to manually upgrade from 2->3->4->5?
My point in this ticket is that 2->5 did not work and broke the table by changing the key generator class.
Describe the problem you faced
We found some old tables that were still on table version 2. Hudi 0.14.1 Spark jobs failed to read them (they could not recognize the `hoodie.properties` file).
To remediate, I created a Spark job running Hudi 0.12.2 so that I could run the `upgrade table` command to upgrade from table version 2 to table version 5 (ingestion on Hudi 0.14.1 would then upgrade the table to version 6).
I was surprised to see the key generator class was changed/broken by the CLI:
To Reproduce
Steps to reproduce the behavior:
Create a table on Hudi 0.9.0, then run `upgrade table` to table version 5.
Expected behavior
The key generator should not be overwritten.
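A rough way to check for this is to save a copy of `hoodie.properties` before running the upgrade and compare it with the file afterwards. The sketch below is only an illustration: the helper names are made up, the parser handles plain `key=value` lines only (no escapes or line continuations), and the class names in the sample data are placeholder values, not the ones from the affected table. It assumes the table config keys `hoodie.table.keygenerator.class` and `hoodie.table.version`.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Table config keys assumed to be present in hoodie.properties.
KEYGEN_KEY = "hoodie.table.keygenerator.class"
VERSION_KEY = "hoodie.table.version"


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def changed_keys(
    before: Dict[str, str],
    after: Dict[str, str],
    keys: Iterable[str],
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (key, old, new) for each watched key whose value differs."""
    diffs = []
    for key in keys:
        old, new = before.get(key), after.get(key)
        if old != new:
            diffs.append((key, old, new))
    return diffs


if __name__ == "__main__":
    # Placeholder contents; in practice read the two saved copies of
    # <table base path>/.hoodie/hoodie.properties instead.
    before_text = (
        "hoodie.table.version=2\n"
        "hoodie.table.keygenerator.class=org.apache.hudi.keygen.ComplexKeyGenerator\n"
    )
    after_text = (
        "hoodie.table.version=5\n"
        "hoodie.table.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator\n"
    )
    before = parse_properties(before_text)
    after = parse_properties(after_text)
    for key, old, new in changed_keys(before, after, [KEYGEN_KEY]):
        print(f"{key} changed: {old} -> {new}")
```

The table version is expected to change during an upgrade, so only the key generator key is watched here; any entry reported for it would indicate the behavior described in this issue.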
Environment Description
Hudi version : 0.12.2
Spark version : 3.3.1
Hive version :
Hadoop version :
Storage (HDFS/S3/GCS..) :
Running on Docker? (yes/no) :
Additional context
This was done on EMR