rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

one question on mapping #156

Closed updatex closed 7 years ago

updatex commented 8 years ago

Here is one row:

{
  "name": "ccc",
  "attributes": {
    "sex": "male",
    "details": {
      "hobby": "basketball",
      "Hobby": "fooball"
    }
  }
}

Can I use the following DDL?

create table abc (
name string,
attributes struct< sex:string,  details:map<string,string> >
) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 
  "attributes.details.hobby"= "attributes.details.hobby1" ,
   "attributes.details.Hobby"= "attributes.details.hobby2") 
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
;

I see someone asked a similar question in #104. I hope someone can help me.

rcongiu commented 8 years ago

Mapping works to map a column to a JSON attribute, but you can't specify the whole JSON path, just the attribute name. In your case, this should work: WITH SERDEPROPERTIES ("mapping.hobby1"="hobby", "mapping.hobby2"="Hobby")

hidarapaneni commented 8 years ago

Hi rcongiu,

I tried the same approach, but I still get the duplicate issue. Please advise.

Here are the details. JAR: json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar

Sample JSON: {"Time":"forme","time":"foryou"}

DDL: create external table sample_dup1(duptime1 string, duptime2 string) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' WITH SERDEPROPERTIES ( "mapping.duptime1"="Time", "mapping.duptime2"="time") location '/user/hive/warehouse/samp_dup_test/';
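
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the SerDe may treat JSON keys case-insensitively by default, so "Time" and "time" would collapse into the same key before the mappings are applied. The project README describes a "case.insensitive" SerDe property for this situation. The sketch below assumes the JAR in use supports that property, and it reuses the table name, columns, and location from the DDL above.

```sql
-- Sketch only: assumes this build of the SerDe honors "case.insensitive".
-- With it set to "false", "Time" and "time" should stay distinct keys,
-- so each mapping can target one of them:
--   duptime1 reads the key "Time", duptime2 reads the key "time".
CREATE EXTERNAL TABLE sample_dup1 (
  duptime1 STRING,
  duptime2 STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  "case.insensitive" = "false",
  "mapping.duptime1" = "Time",
  "mapping.duptime2" = "time"
)
LOCATION '/user/hive/warehouse/samp_dup_test/';
```

Running SELECT duptime1, duptime2 FROM sample_dup1; against the sample row is a quick way to check whether both columns are populated. If the property is not recognized by the JAR, the duplicate issue would likely persist.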