rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

SerDe throws ClassCastException when using the max function on a complex struct in Hive queries #67

Closed appanasatya closed 10 years ago

appanasatya commented 10 years ago

Hive's Max Feature for complex struct: https://issues.apache.org/jira/browse/HIVE-1128

FYI: I am using json-serde-1.1.9.3-SNAPSHOT-jar-with-dependencies.jar.

CREATE EXTERNAL TABLE my_table (
  data_info struct<id:bigint,to_date:timestamp,name:string,age:int>,
  class_standard int)
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
LOCATION
'/tmp/my_table';

select max(struct(data_info.id, data_info)).col2 as data_information, class_standard from my_table group by class_standard;

The above query throws a ClassCastException:

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:622)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:682)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:572)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$GenericUDAFMaxEvaluator.merge(GenericUDAFMax.java:109)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$GenericUDAFMaxEvaluator.iterate(GenericUDAFMax.java:96)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:658)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:854)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:751)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:819)
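For readers unfamiliar with the idiom in the query above: max over a struct is expected to compare fields left to right, so the first field (data_info.id) acts as the sort key and .col2 returns the payload of the winning row. Below is a rough sketch of that idea in plain Java, with no Hive dependencies. The class StructMaxSketch, its Pair type, and the sample values are hypothetical and only illustrate the comparison order; this is not Hive's implementation.

```java
import java.util.Arrays;
import java.util.List;

class StructMaxSketch {
    /** Hypothetical stand-in for struct(data_info.id, data_info). */
    public static final class Pair {
        public final long col1;    // the sort key (id)
        public final String col2;  // the payload (simplified to a string here)

        public Pair(long col1, String col2) {
            this.col1 = col1;
            this.col2 = col2;
        }
    }

    /** Compare field by field: col1 first, col2 only breaks ties. */
    public static int compare(Pair a, Pair b) {
        int c = Long.compare(a.col1, b.col1);
        return c != 0 ? c : a.col2.compareTo(b.col2);
    }

    /** Return the largest pair in the group, or null for an empty group. */
    public static Pair max(List<Pair> rows) {
        Pair best = null;
        for (Pair p : rows) {
            if (best == null || compare(p, best) > 0) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Pair> group = Arrays.asList(
                new Pair(3, "c"), new Pair(7, "newest"), new Pair(5, "e"));
        System.out.println(max(group).col2); // prints "newest"
    }
}
```

Because the comparison has to read each field through its declared type, a bigint field whose runtime value is actually a String would fail at exactly this step, which is consistent with the trace.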

kashyap-fk commented 10 years ago

Something even simpler isn't working.

CREATE EXTERNAL TABLE demo_table (
  foo bigint COMMENT 'from deserializer', 
  bar string COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
LOCATION 
'/tmp/my_table';
hive> select max(foo) from demo_table;
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"foo":1398264180984,"bar":"01fb6d96-2047-4662-aca9-6fe04509966e"}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"foo":1398264180984,"bar":"01fb6d96-2047-4662-aca9-6fe04509966e"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:824)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
    ... 9 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:622)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:572)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$GenericUDAFMaxEvaluator.merge(GenericUDAFMax.java:109)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$GenericUDAFMaxEvaluator.iterate(GenericUDAFMax.java:96)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:658)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:854)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:751)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:819)
    ... 18 more

rcongiu commented 10 years ago

What version of Hive are you using? It may indeed be a Hive bug, if the MAX operator picks the higher value but not its associated ObjectInspector.
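To make the failure mode concrete: the innermost frame in the traces is JavaLongObjectInspector.get, and the message says a java.lang.String could not be cast to java.lang.Long, so the object handed to the inspector for the bigint column was evidently a String. The sketch below reproduces that mismatch in plain Java without Hive on the classpath. The class InspectorMismatchSketch and its strictGet and lenientGet methods are hypothetical names made up for illustration; lenientGet shows one possible way to tolerate both representations and is not necessarily what Hive or this SerDe does.

```java
class InspectorMismatchSketch {
    /** Mimics an inspector that assumes the value is already a Long. */
    public static long strictGet(Object o) {
        // Throws ClassCastException when o is a String.
        return ((Long) o).longValue();
    }

    /** Tolerant variant: accepts a boxed number or a numeric String. */
    public static long lenientGet(Object o) {
        if (o instanceof String) {
            return Long.parseLong((String) o);
        }
        return ((Number) o).longValue();
    }

    public static void main(String[] args) {
        Object heldAsString = "1398264180984"; // same digits as "foo" in the failing row

        System.out.println(lenientGet(heldAsString));

        try {
            strictGet(heldAsString);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, same kind as in the trace");
        }
    }
}
```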

appanasatya commented 10 years ago

Hive Version: 0.10.0-cdh4.2.0

appanasatya commented 10 years ago

The max function on a complex struct works with Hive's other SerDes (though none of them are JSON SerDes).

kashyap-fk commented 10 years ago

We (@appanasatya and I) fixed this yesterday and sent you pull request #68.

rcongiu commented 10 years ago

Thanks! Merged.