badion closed this issue 4 years ago
As a note, Hudi 0.5.2 was packaged from master 1 day ago.
@badion This does seem directly related to the complex types issue fixed recently.. In 0.5.1-2 we moved off databricks-avro and onto spark-avro, and this seems like a miss.
Are you interested in a custom patch for this on top of 0.5.2? Not sure I follow the last sentence.. Please clarify, happy to get this moving along for you..
cc @umehrot2 @zhedoubushishi as well to chime in
@vinothchandar It seems the issue is gone after building the .jar file from commit (merge) ce0a4c64d07d6eea926d1bfb92b69ae387b88f50, which apparently landed after the Hudi 0.5.2 release. We also tried the Hudi jar from Maven Central, and it does not seem to include the Avro fix yet.
I think we will wait for the next release, which will include those changes.
@badion Yeah, the fix for this did not make it into 0.5.2. You can either build a custom Hudi with this patch applied on top of 0.5.2 or wait until the next release.
Closing this issue as it will be resolved in the next release.
First, thanks for the great lib, which reduces the complexity of our ETL pipelines massively!
Is the next release due in the near future? I'm asking because the latest release contains this critical bug that causes the library to simply not work. I'm currently evaluating Hudi as an alternative to Delta Lake and hit this issue pretty fast. Would it be possible to release a hotfix so that new users can start working with the lib by following the getting started section and then move on to more complex data models?
@rolandjohann Thanks for the feedback.. We are trying to bundle a few more such fixes and release 0.6.0 later this month... Backporting some fixes alone onto 0.5.2 and doing a 0.5.3 may make sense though.. Let me bring this up with the community and see what everyone feels..
Hi, any update on when this will be released and rolled out?
@nsivabalan is driving the release.. We are planning to do a 0.5.3 this week. right siva ? This release will have the fix.. @nikitap95 if interested, you can join the mailing list and help validate the release candidate :)
@vinothchandar Thanks for your prompt response. Will wait for the release in that case rather than using the patch. Sure, I'll get myself added to it, would be great to be a part of it!
yes, I should have a candidate up for voting by today or tomorrow.
Currently we are working with Hudi 0.5.0 and AWS Glue, and everything works fine for .parquet and COW mode, with complex types in the data and different nullable options.
After switching to Hudi 0.5.2, we started facing issues related to:
https://github.com/apache/incubator-hudi/pull/1406
The Spark application fails while writing a DataFrame into a Hudi table when using complex types like:
with nullable = true set on them. Up to the moment of saving, everything is fine, and we are able to see the complete DataFrame:
Note that all simple types work fine when saving data into a Hudi table, as do complex types with nullable = false.
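For anyone trying to narrow down which columns are affected, here is a rough sketch (not part of the original report) that walks a Spark schema in its JSON form, e.g. the dict returned by `df.schema.jsonValue()`, and lists the fields that are both complex (struct, array, map) and nullable. The helper names and the example schema are hypothetical; the actual types from the failing job are not shown above, so the schema below is only an illustration.

```python
# Hypothetical helper: find struct fields that are complex AND nullable,
# i.e. the combination described above as failing on write.
COMPLEX_TYPES = {"struct", "array", "map"}


def is_complex(type_json):
    # In Spark's schema JSON, simple types are plain strings ("string", "long"),
    # while complex types are dicts carrying a "type" key.
    return isinstance(type_json, dict) and type_json.get("type") in COMPLEX_TYPES


def nullable_complex_fields(schema_json, prefix=""):
    """Return dotted paths of fields that are complex and nullable.

    Only nested structs are descended into; structs inside arrays or maps
    are not inspected in this sketch.
    """
    found = []
    for field in schema_json.get("fields", []):
        path = prefix + field["name"]
        ftype = field["type"]
        if is_complex(ftype):
            if field.get("nullable", True):
                found.append(path)
            if ftype["type"] == "struct":
                found.extend(nullable_complex_fields(ftype, path + "."))
    return found


# Illustrative schema only (assumed, not taken from the failing job).
example_schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": False, "metadata": {}},
        {
            "name": "tags",
            "type": {"type": "array", "elementType": "string", "containsNull": True},
            "nullable": True,
            "metadata": {},
        },
        {
            "name": "address",
            "type": {
                "type": "struct",
                "fields": [
                    {"name": "city", "type": "string", "nullable": True, "metadata": {}},
                    {
                        "name": "geo",
                        "type": {
                            "type": "map",
                            "keyType": "string",
                            "valueType": "double",
                            "valueContainsNull": True,
                        },
                        "nullable": True,
                        "metadata": {},
                    },
                ],
            },
            "nullable": False,
            "metadata": {},
        },
    ],
}

print(nullable_complex_fields(example_schema))  # ['tags', 'address.geo']
```

In a real job the input would come from the DataFrame itself, e.g. `nullable_complex_fields(df.schema.jsonValue())`, run just before the Hudi write.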
Steps to reproduce the behavior:
Expected behavior: the Hudi table should be saved successfully in parquet format with complex-type fields that have nullable = true. Hudi 0.5.0 works fine with the full variety of complex types and nullable fields.
Local/AWS Glue 1.0:
Stacktrace
...
...
Is this already a known issue for Hudi versions after 0.5.0? If so, is there a workaround that would allow us to upgrade to 0.5.2?