Closed by cheqianh 2 years ago
We are currently removing null values from the top-level container here, since Hive can't handle IonNull. However, this code chunk doesn't work as expected: we need to keep IonNull values as primitive null instead of removing them from the container.
This is a good opportunity to remove unnecessary null check logic and rely on object inspectors to filter Ion null. (See bullet point 3 of the Solution in the PR above. We should rely on the object inspector to filter null values centrally to avoid potential human error, rather than relying on developers to handle all null types in different methods.)
An overview of the changes, including all 5 commits below, is here.
1845a62 improved the unit test style for easier debugging.
d2bc9dc, 1fa0a61 and 9836da9 added null type checks for the Ion container object inspectors, along with their unit tests.
80f0f25 makes the object inspectors' get-element methods also return primitive null for IonNull. I don't know where these methods will be used, but I kept their behavior the same as before to avoid potential issues.
Description:
This PR adds a null check to the IonStructToMap object inspector to avoid an NPE.
This PR consists of two commits: the first (91736de) contains unit tests and the second (a226202) includes the fix.
Issue:
The ion-hive-serde Struct to Map conversion throws an NPE for null values.
More specifically, with the default configuration, ion-path-extractor only matches the top-level container, not the Ion values within a nested container. So null values within a nested container are not filtered and are still treated as Ion null rather than primitive null as we expect here (this is similar to the case-insensitivity issue we saw earlier). The Ion case-insensitive decorator later converts them back to primitive null here when iterating over the struct. Therefore, in the for loop, null.getFieldName() throws an NPE.
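The failure mode can be illustrated with a minimal, self-contained sketch. It does not use the real ion-java or ion-hive-serde classes: `Field` and `unguardedToMap` are hypothetical stand-ins, and the `null` list element only simulates a struct entry that has already been turned into primitive null before the loop runs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedNullNpeSketch {
    /** Hypothetical stand-in for a struct field; not the real ion-java type. */
    static final class Field {
        final String name;
        final Object value;

        Field(String name, Object value) {
            this.name = name;
            this.value = value;
        }

        String getFieldName() {
            return name;
        }
    }

    /** Unguarded conversion: assumes every element handed to the loop is non-null. */
    static Map<String, Object> unguardedToMap(List<Field> struct) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field f : struct) {
            // Throws NullPointerException when f is null.
            out.put(f.getFieldName(), f.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Field> struct = new ArrayList<>();
        struct.add(new Field("a", 1));
        struct.add(null); // simulates an entry already converted to primitive null

        try {
            unguardedToMap(struct);
        } catch (NullPointerException e) {
            System.out.println("NPE while iterating the struct");
        }
    }
}
```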
Solution in this PR
Since Hive doesn't know how to parse Ion null, we need to add primitive null to the struct. Currently, we filter Ion null in the deserializer before asking the object inspector to do any conversion. But the object inspector should be able to detect Ion null itself, since it passes the final Map to the upper layer (e.g. Presto/Hive). So in this PR, I added a null check in the object inspector to make sure all values are parsed correctly.
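As a rough illustration of the idea (not the code in this PR), the sketch below performs the null check inside the conversion itself. `Value`, `isNullValue` and `getMap` are hypothetical stand-ins modeled loosely on an Ion value and an object inspector method; the sketch assumes a struct can be represented as a map of field name to value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StructToMapNullCheckSketch {
    /** Hypothetical stand-in for an Ion value; it only models a payload and a null flag. */
    static final class Value {
        final Object payload;
        final boolean ionNull;

        Value(Object payload, boolean ionNull) {
            this.payload = payload;
            this.ionNull = ionNull;
        }

        static Value of(Object payload) {
            return new Value(payload, false);
        }

        static Value ionNull() {
            return new Value(null, true);
        }

        boolean isNullValue() {
            return ionNull;
        }
    }

    /**
     * Converts a struct-like value to a Map.
     * A missing or Ion-null struct becomes primitive null, and each Ion-null
     * field is kept in the Map with a primitive null value instead of being dropped.
     */
    static Map<String, Object> getMap(Value data) {
        if (data == null || data.isNullValue()) {
            return null;
        }
        @SuppressWarnings("unchecked")
        Map<String, Value> fields = (Map<String, Value>) data.payload;
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Value> e : fields.entrySet()) {
            Value v = e.getValue();
            out.put(e.getKey(), (v == null || v.isNullValue()) ? null : v.payload);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Value> fields = new LinkedHashMap<>();
        fields.put("a", Value.of(1));
        fields.put("b", Value.ionNull());

        System.out.println(getMap(Value.of(fields)));
        System.out.println(getMap(Value.ionNull()));
    }
}
```

Doing the check in one place means callers receive primitive null without each having to know about Ion null.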
The solution follows the three points below.
I added a null check for Ion Map to Struct. Once it's addressed, I'll add null checks for the other struct object inspectors.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.