obiwan866 opened this issue 10 years ago
Hi, most people use the SerDe with text files, since it's usually for ingesting data from/to external systems that don't know sequence files. For this reason, I don't think I've ever tested it with sequence files. However, it should still work if the sequence file was created correctly. Can you send me a sample sequence file you're using? I'll have a look and test it.
Hi Roberto, thank you for your quick answer. I've attached a test file. I hope it will be useful.
Have a good day.
Frédéric
Was this ever resolved? I'm having the same issue.
I'm also having the same issue. I can send the JSON string or the file; just let me know where.
Can you email it to rcongiu@yahoo.com?
Hi, if there is any improvement on this SerDe, I would be interested!
Have a good day.
Frédéric
I'm getting the exact same thing using a sequence file.
I got past this with:

PARTITIONED BY (date STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileAsTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION 'hdfs://nameservice1/mydata'
So it seems that SequenceFileAsTextInputFormat avoids the exception, but I get nothing but nulls. I suspect the string handed to the SerDe is some sort of byte string and not the JSON: that input format converts each value by calling toString() on it, and BytesWritable.toString() produces a hex dump rather than the original JSON, which the SerDe cannot parse.
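A minimal probe, assuming the diagnosis above is right, is to point a single-string-column table at the same files using Hive's default text SerDe and inspect the raw values (the table name raw_probe is illustrative; the location is the one from the DDL above):

-- Hypothetical probe table: exposes each record's value exactly as
-- SequenceFileAsTextInputFormat delivers it to the SerDe.
CREATE EXTERNAL TABLE raw_probe (line STRING)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileAsTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION 'hdfs://nameservice1/mydata';

SELECT line FROM raw_probe LIMIT 5;

If the rows come back as space-separated hex pairs instead of JSON text, the values were written as BytesWritable and the SerDe never sees parseable JSON.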
Thanks for the library. I've encountered the same issue with Presto, with BytesWritable as the sequence file value. I hope this PR helps fix it; let me know in case anything is needed.
Hi, first I would like to thank you for your great work. Here is my problem: I'm using Hive 0.12 on CDH 5.0 and I'm trying to create a table from a sequence file containing JSON.
CREATE EXTERNAL TABLE Test_Json (
  url STRING,
  Ts TIMESTAMP,
  SESSIONID STRING,
  PARAMS STRING,
  CONTEXT STRUCT<USERNAME:STRING, LASTNAME:STRING, FORENAME:STRING,
                 LINE:STRING, STREET:STRING, CITY:STRING,
                 DPT:STRING, REGION:STRING>
)
COMMENT 'foo'
PARTITIONED BY (date STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS SEQUENCEFILE
LOCATION '/data/foo';
Then I add a partition, and when I try to query it I get this error: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text.
If I load my data as a text file there is no issue, and I'm able to select all the fields I want (especially in the struct field).
Am I doing something wrong?
Thanks in advance.
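One workaround, sketched under the assumption that the same JSON is also available as plain text somewhere (the table names, column list, and the /data/foo_text path below are illustrative): let Hive write the sequence file itself, so that the values are stored as Text rather than BytesWritable and the SerDe can cast them on read.

-- Staging table over the existing JSON text files (path illustrative).
CREATE EXTERNAL TABLE test_json_text (
  url STRING, ts TIMESTAMP, sessionid STRING, params STRING,
  context STRUCT<username:STRING, lastname:STRING, forename:STRING,
                 line:STRING, street:STRING, city:STRING,
                 dpt:STRING, region:STRING>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/data/foo_text';

-- Same schema, but stored as a Hive-written sequence file.
CREATE TABLE test_json_seq (
  url STRING, ts TIMESTAMP, sessionid STRING, params STRING,
  context STRUCT<username:STRING, lastname:STRING, forename:STRING,
                 line:STRING, street:STRING, city:STRING,
                 dpt:STRING, region:STRING>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS SEQUENCEFILE;

-- Hive re-serializes each row to JSON and writes it as a Text value,
-- which the SerDe can deserialize when the table is queried.
INSERT OVERWRITE TABLE test_json_seq SELECT * FROM test_json_text;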