rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.
Other
733 stars 391 forks source link

Top order Array #200

Open sushantkhanayat opened 6 years ago

sushantkhanayat commented 6 years ago

Hi, I am able to insert the data in the table but only first record is getting picked and for other record it's coming NULL, i am pretty sure its not able to find out the second record in the array. Please help me out here. I have a sample file in below format:

[{"expiration_date":"2003-04-04","county_parish":"Bee","acres":20.00000000,"grantee_alias":"ROBERT H ADAMS","bonus":null,"instrument_date":"2000-04-04","grantee_address":"P. O. Box 20602 , Oklahoma City, OK 73156-0602","state_province":"TX","instrument_type":"Memo of lease","effective_date":null,"maximum_depth":0,"county_state":"Bee (TX)","volume_page":"629/734","id":168,"grantee_name":"Robert H. Adams","record_number":"118567","blm_lease":0,"grantor_address":"235 West Jefferson Street , Tipton, IN 46072-1823","term_months":36,"extension_bonus":null,"royalty":null,"extension_term_month":null,"record_date":"2000-05-08","minimum_depth":0,"grantor_name":"Earl E. Nelson, et ux","geometry":{"type":"Polygon","coordinates":[[[-98.02555516,28.65628727],[-98.02763665,28.6562646],[-98.02767242,28.65998903],[-98.02559752,28.65996315],[-98.02555516,28.65628727]]]},"state_lease":0,"extension":0},{"expiration_date":"2003-03-23","county_parish":"Bee","acres":36.28000000,"grantee_alias":"SOUTHERN RESOURCE","bonus":null,"instrument_date":"2000-03-23","grantee_address":"P. O. Box 6483 , Corpus Christi, TX 78466","state_province":"TX","instrument_type":"Lease","effective_date":null,"maximum_depth":null,"county_state":"Bee (TX)","volume_page":"629/946","id":169,"grantee_name":"Southern Resources Company","record_number":"10","blm_lease":0,"grantor_address":"726 Texas Avenue , Port Neches, TX 77651","term_months":36,"extension_bonus":null,"royalty":0.20000000,"extension_term_month":null,"record_date":"2000-05-10","minimum_depth":0,"grantor_name":"Winnie Harris Brown, et al","geometry":{"type":"Polygon","coordinates":[[[-97.83107188,28.43751166],[-97.83275784,28.4404257],[-97.83025018,28.44150562],[-97.82625245,28.43967666],[-97.83107188,28.43751166]]]},"state_lease":0,"extension":0}]

DDL i have come up with is : CREATE TABLE test( a ARRAY< struct <expiration_date:Date, county_parish:String, acres:float, grantee_alias:string, bonus:string, instrument_date:date, grantee_address:string, state_province:string, instrument_type:string, effective_date:date, maximum_depth:int, county_state:string, volume_page:string, id:INT, grantee_name:string, record_number:int, blm_lease:int, grantor_address:STRING, term_months:INT, extension_bonus:STRING, royalty:DOUBLE, extension_term_month:STRING, record_date:DATE, minimum_depth:INT, grantor_name:STRING, geometry:STRUCT< type: STRING, coordinates: ARRAY< ARRAY< ARRAY< ARRAY< DOUBLE > > > > >, state_lease:INT, extension:INT

)ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' stored as textfile;

YevheniiStepaniuk commented 5 years ago

@sushantkhanayat I've faced with the same. Have you any solution to fix that?

rcongiu commented 4 years ago

I see, the issue is that your record is actually an array. In hive a record has to be a struct.