rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

Can I process multiline JSON data in Hive? #201

Open debuggerrr opened 6 years ago

debuggerrr commented 6 years ago

I can create a table and process single-line JSON data, but can I also process JSON data like the sample below?

[{
    "field1": "data1",
    "field2": 100,
    "field3": "more data1",
    "field4": 123.001
}, {
    "field1": "data2",
    "field2": 200,
    "field3": "more data2",
    "field4": 123.002
}, {
    "field1": "data3",
    "field2": 300,
    "field3": "more data3",
    "field4": 123.003
}, {
    "field1": "data4",
    "field2": 400,
    "field3": "more data4",
    "field4": 123.004
}]

I have read that multiline JSON data wasn't supported in Hive. Can I use it now? If so, please share links where I can read about it; I searched a lot but couldn't find any relevant material. Thanks in advance.

rcongiu commented 6 years ago

Do you have an actual file sample? The SerDe won't work unless there's one JSON record per line. R.

On Sunday, November 12, 2017, 5:06:32 AM PST, debuggerrr <notifications@github.com> wrote:  

I want to load the JSON data below, and for that I am trying to create a table as follows:

CREATE TABLE my_table (field1 string, field2 int, field3 string, field4 double) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' ;

I added the JAR files below:

I referred to many links on Stack Overflow and GitHub, but to no avail. Please help me with this and let me know where I am going wrong.
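
A minimal sketch of one way to meet the one-record-per-line requirement mentioned above: parse the array in Python and write each element as a compact single-line object before loading the file into the table. The function name `array_to_json_lines` is hypothetical and not part of the SerDe, and the sketch assumes the whole file fits in memory.

```python
import json

def array_to_json_lines(text):
    """Convert a JSON array into text with one compact JSON value per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # json.dumps without indent never emits raw newlines, so each record stays on one line.
    return "".join(json.dumps(record) + "\n" for record in records)

# Two records taken from the sample in the question.
sample = """[{
    "field1": "data1",
    "field2": 100,
    "field3": "more data1",
    "field4": 123.001
}, {
    "field1": "data2",
    "field2": 200,
    "field3": "more data2",
    "field4": 123.002
}]"""

print(array_to_json_lines(sample), end="")
```

The output file can then be placed in the table's location like any other single-line JSON input.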

debuggerrr commented 6 years ago

Can I have an answer to this? @rcongiu

vincenzocapel commented 5 years ago

Does anyone have an answer to this question? I have the same issue. Thanks.

fayedd commented 5 years ago

Since a normal JSON record should start with '{' and end with '}', maybe you can remove the '[' and ']' and then set row format delimited fields terminated by ','
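
Removing only the brackets from the sample in the question still leaves each record spread over several lines and separated by commas. As a rough sketch of the bracket-removal idea, assuming every top-level element is a JSON object, the hypothetical helper `iter_objects` below skips brackets, commas, and whitespace between objects and decodes them one at a time with the standard library's `json.JSONDecoder.raw_decode`.

```python
import json

def iter_objects(text):
    """Yield each top-level JSON object from text, with or without the enclosing brackets."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # Skip separators between objects; this assumes the elements are objects, not arrays.
        if text[pos] in " \t\r\n,[]":
            pos += 1
            continue
        # raw_decode returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Input with the brackets already removed, as suggested above.
stripped = '{"field1": "data1", "field2": 100},\n{\n  "field1": "data2",\n  "field2": 200\n}'

for record in iter_objects(stripped):
    print(json.dumps(record))  # one record per line
```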