mohit10verma opened 1 year ago
I tried to fix this by making the following change in the Hive implementation of StdUdfWrapper::wrap():
try {
  Object hiveObject = hiveDeferredObject.get();
  if (hiveObject != null) {
    if (stdData instanceof StdStruct) {
      if (hiveObject.getClass().isArray()) {
        Object[] hiveObjects = (Object[]) hiveObject;
        ((PlatformData) stdData).setUnderlyingData(new ArrayList<>(Arrays.asList(hiveObjects)));
      }
    } else {
      ((PlatformData) stdData).setUnderlyingData(hiveObject);
    }
    return stdData;
  } else {
    return null;
  }
} catch (HiveException e) {
  // Preserve the original exception as the cause
  throw new RuntimeException("Cannot extract Hive Object from Deferred Object", e);
}
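For reference, the Object[]-to-List conversion used in the struct branch above behaves like this in plain Java (a stand-alone sketch, independent of the Hive and Transport classes; the class and method names here are illustrative only):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrayToListDemo {
  // Same conversion as in the fix: wrap the fixed-size Arrays.asList view
  // in a new ArrayList so the result is a mutable, independent copy.
  static List<Object> toList(Object[] hiveObjects) {
    return new ArrayList<>(Arrays.asList(hiveObjects));
  }

  public static void main(String[] args) {
    Object[] structFields = new Object[] {1, "a", 3.0};
    List<Object> underlying = toList(structFields);
    underlying.set(0, 2);                // mutating the copy...
    System.out.println(structFields[0]); // ...leaves the original array untouched: prints 1
    System.out.println(underlying);      // prints [2, a, 3.0]
  }
}
```

Wrapping in a new ArrayList matters because Arrays.asList alone returns a fixed-size view backed by the original array.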
But after the above fix, I ran into more class cast issues, because the HiveInteger created in HiveFactory is of type JavaIntObjectInspector, and that runs into trouble during serialization in Hive's FetchFormatter (SerDeUtils.toThriftPayload -> SerDeUtils.buildJSONString). This is because JavaIntObjectInspector::get tries to cast IntWritable to Integer directly.
I'll leave it to the experts of this codebase to fix this issue / guide me.
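The direct-cast failure described above can be reproduced with a minimal stand-alone example. FakeIntWritable below is a hypothetical stand-in for Hive's IntWritable, used only to show why a direct (Integer) cast on a Writable wrapper fails:

```java
// Hypothetical stand-in for Hive's IntWritable: an int wrapper that is not an Integer.
class FakeIntWritable {
  private final int value;
  FakeIntWritable(int v) { this.value = v; }
  int get() { return value; }
}

public class CastDemo {
  // Returns true if a direct cast of the given object to Integer throws,
  // mirroring the cast that JavaIntObjectInspector::get performs.
  static boolean directCastToIntegerFails(Object o) {
    try {
      Integer ignored = (Integer) o;
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Object fromHive = new FakeIntWritable(42);
    System.out.println(directCastToIntegerFails(fromHive));          // prints true
    System.out.println(directCastToIntegerFails(Integer.valueOf(42))); // prints false
  }
}
```

A Writable wrapper holds an int but is not an Integer, so the cast can only succeed if the value is unwrapped (e.g. via the wrapper's get()) first.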
Mohit, please share the minimal code and the steps to reproduce the error locally. You can raise a PR in transport and add the steps in the PR description.
Hello, I posted all the details here already. To reproduce this issue, you can create a new UDF in the transport-udfs-examples module and then write a unit test for it. The code for both is in the bug description.
The previous comment was my attempt to fix it, but I believe it was not complete.
From the error trace, I'm not sure which line is throwing the error. For example, it would help to know what is at StructElementIncrementByOneFunction.java:33.
Can you refer to the struct fields by name instead of by index? For example:
((StdInteger) myStruct.getField("int_field")).get()
Created a PR to repro the issue: https://github.com/linkedin/transport/pull/123/files
I created a simple UDF and a unit test for it in the transport-udfs-examples module. The UDF increments the first integer field of a struct by 1. The full UDF and unit-test sources are in the PR linked above.
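The actual UDF sources live in the PR; as a rough, self-contained illustration of the logic only (the Simple* types below are hypothetical stand-ins, not the real Transport StdStruct/StdInteger interfaces):

```java
import java.util.List;

// Hypothetical stand-ins for Transport's data types, just enough to show the logic.
interface StdData {}

class SimpleInteger implements StdData {
  private final int value;
  SimpleInteger(int v) { this.value = v; }
  int get() { return value; }
}

class SimpleStruct implements StdData {
  private final List<StdData> fields;
  SimpleStruct(List<StdData> fields) { this.fields = fields; }
  StdData getField(int i) { return fields.get(i); }
}

public class StructElementIncrementByOneDemo {
  // Mirrors the UDF's eval: read the first field as an integer and add 1.
  static SimpleInteger eval(SimpleStruct struct) {
    int v = ((SimpleInteger) struct.getField(0)).get();
    return new SimpleInteger(v + 1);
  }

  public static void main(String[] args) {
    SimpleStruct input = new SimpleStruct(List.of(new SimpleInteger(41), new SimpleInteger(7)));
    System.out.println(eval(input).get()); // prints 42
  }
}
```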
The generated spark_2.11 and spark_2.12 UDFs run correctly, but the generated hive artifact fails with the exception below (I am running the gradle task ./gradlew hiveTask for testing). I don't have any local changes. Am I doing something wrong, or is this something to fix?