SamarthRaval opened 5 months ago
While trying to do an insert operation after a bulk insert, I ran into the above error.
Not sure what to do here?
@xushiyan @ad1happy2go @bhasudha
Could you please help me here? Thank you.
Did you try to use the drop column statement?
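For reference, a minimal sketch of what a drop-column statement could look like through Spark SQL; the table and column names are hypothetical, and this assumes Hudi's schema-on-read evolution is enabled:

```python
# Minimal sketch, assuming a `spark` session with the Hudi bundle on the
# classpath and an existing Hudi table registered as `example_table`
# (hypothetical name).
spark.sql("set hoodie.schema.on.read.enable=true")  # enable schema evolution DDL
spark.sql("ALTER TABLE example_table DROP COLUMN middle_col")  # hypothetical column
```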
@SamarthRaval hoodie.datasource.write.reconcile.schema should ideally handle that. Can you try removing hoodie.write.set.null.for.missing.columns?
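For illustration, a minimal PySpark sketch of the suggested configuration; only the two configs discussed here are the point, and the table name, key fields, and path are hypothetical placeholders:

```python
# Minimal sketch, assuming an existing DataFrame `df` and a `spark` session
# with the Hudi bundle on the classpath.
hudi_options = {
    "hoodie.table.name": "example_table",             # hypothetical
    "hoodie.datasource.write.recordkey.field": "id",  # hypothetical
    "hoodie.datasource.write.precombine.field": "ts", # hypothetical
    "hoodie.datasource.write.operation": "insert",
    # Rely on schema reconciliation to handle columns missing from the
    # write schema...
    "hoodie.datasource.write.reconcile.schema": "true",
    # ...and do NOT set hoodie.write.set.null.for.missing.columns at all.
}

df.write.format("hudi") \
    .options(**hudi_options) \
    .mode("append") \
    .save("s3://bucket/example_table")  # hypothetical path
```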
@SamarthRaval Let's try to reproduce with a sample dataset if possible.
> @SamarthRaval hoodie.datasource.write.reconcile.schema should ideally handle that. Can you try removing hoodie.write.set.null.for.missing.columns?

Yes, it should handle it, even if a few columns are missing from the write schema.
The problem is that for another customer it works with the same configuration, with no problem at all.
> @SamarthRaval hoodie.datasource.write.reconcile.schema should ideally handle that. Can you try removing hoodie.write.set.null.for.missing.columns?

I tried to follow this.
> Did you try to use the drop column statement?

No, I am not dropping any column, but when I checked closely there are some columns which are missing. Shouldn't hoodie.datasource.write.reconcile.schema automatically take care of that?
> @SamarthRaval Let's try to reproduce with a sample dataset if possible.

Hello @ad1happy2go @danny0405
I was able to reproduce this in a research environment and got the exact same error. When an in-between column is missing from the incoming data, the write throws the above error.
My understanding was that with reconcile.schema enabled, Hudi would just populate null for the missing column, but that does not seem to be the case.
Any idea about this?
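A minimal sketch of that reproduction scenario, assuming a simple table (names, key fields, and path are hypothetical): a bulk_insert with the full schema, followed by an insert whose batch is missing the in-between column `b`:

```python
# Minimal sketch, assuming a `spark` session with the Hudi bundle on the
# classpath. All names and the path are hypothetical.
from pyspark.sql import Row

base_path = "s3://bucket/repro_table"  # hypothetical path
common = {
    "hoodie.table.name": "repro_table",               # hypothetical
    "hoodie.datasource.write.recordkey.field": "id",  # hypothetical
    "hoodie.datasource.write.precombine.field": "ts", # hypothetical
    "hoodie.datasource.write.reconcile.schema": "true",
}

# Step 1: bulk_insert with the full schema (id, a, b, c, ts).
full = spark.createDataFrame([Row(id=1, a="a1", b="b1", c="c1", ts=1)])
full.write.format("hudi") \
    .options(**common) \
    .option("hoodie.datasource.write.operation", "bulk_insert") \
    .mode("overwrite") \
    .save(base_path)

# Step 2: insert with the in-between column `b` missing from the batch.
# With reconcile.schema enabled, the expectation is that `b` is populated
# with null; instead, this write throws the error reported above.
partial = spark.createDataFrame([Row(id=2, a="a2", c="c2", ts=2)])
partial.write.format("hudi") \
    .options(**common) \
    .option("hoodie.datasource.write.operation", "insert") \
    .mode("append") \
    .save(base_path)
```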
**Describe the problem you faced**

**To Reproduce**

Steps to reproduce the behavior:

**Expected behavior**

A clear and concise description of what you expected to happen.

**Environment Description**

* Hudi version : 0.14.0
* Spark version : 3.4.1
* Hive version :
* Hadoop version :
* Storage (HDFS/S3/GCS..) : s3
* Running on Docker? (yes/no) : No

**Additional context**

Add any other context about the problem here.

**Stacktrace**