rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.
Other
733 stars 393 forks source link

ClassCastException String to Timestamp when doing select date field in Hive Query #61

Closed appanasatya closed 10 years ago

appanasatya commented 10 years ago

For the following JSON, { "id":123, "to_date":"2014-03-20 13:23:16" }

CREATE EXTERNAL TABLE my_table(data_info struct <id bigint, to_date timestamp> ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' LOCATION '/tmp/my_table';

Select data_info.to_date from my_table;

I am getting the following ClassCast Exception

java.lang.ClassCastException: java.lang.String cannot be cast to java.sql.Timestamp at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaTimestampObjectInspector.getPrimitiveWritableObject(JavaTimestampObjectInspector.java:33) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:236) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:486) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:439) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:423) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:658) ... 9 more

rcongiu commented 10 years ago

Hi, the json and/or the table definition are incorrect.

to define a to_date struct, you should use

CREATE EXTERNAL TABLE my_table( data_info struct<id:string,to_date:timestamp> )
ROW FORMAT SERDE 
'org.openx.data.jsonserde.JsonSerDe'
LOCATION 
'/tmp/my_table';

and the json should look like

{ data_info: {"id":123, "to_date":"2014-03-20 13:23:16" }}

That works fine for me:

hive> select data_info.to_date from my_table;
Total MapReduce jobs = 1
.....
OK
2014-03-20 13:23:16

The code you posted fails even when creating the table, since there's no definition for the struct.

rcongiu commented 10 years ago

This also works, using the json the way you posted it:

{"id":123, "to_date":"2014-03-20 13:23:16" }

You can create the table like:

CREATE EXTERNAL TABLE my_table( id string, to_date timestamp )
ROW FORMAT SERDE 
'org.openx.data.jsonserde.JsonSerDe'
LOCATION 
'/tmp/my_table';

hive> select to_date from my_table;
Total MapReduce jobs = 1
...
Total MapReduce CPU Time Spent: 590 msec
OK
2014-03-20 13:23:16
Time taken: 9.262 seconds
hive> 
appanasatya commented 10 years ago

By the way, I mean the same as you replied. I did the same as you suggested here, Still it is giving me the class cast exception (String to Timestamp).

Which version of Hive and rcongiu serde are you using ?? I am using hive-0.10.0 and rcongui -json-serde-1.1.9.2-SNAPSHOT-jar-with-dependencies.jar

appanasatya commented 10 years ago

I checked out latest code and it is working fine. Thanks a lot.