Esri / spatial-framework-for-hadoop

The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data processing system for spatial data analysis.
Apache License 2.0

Insert, Update and delete for geospatial data in hive #107

Open SrinivasRIL opened 8 years ago

SrinivasRIL commented 8 years ago

Hi, I want to update and delete some records in a Hive table that contains shape attributes, and I was reading this article: http://unmeshasreeveni.blogspot.in/2014/11/updatedeleteinsert-in-hive-0140.html.

So basically I need to create a Hive table with ACID support. However, we are not using the ORC file format; we are using the unenclosed JSON format instead. Does that mean we can't update or delete any records with the unenclosed JSON format? If there is a way, can you suggest how to go about it?

Thanks

randallwhitman commented 8 years ago

Have you tried it out, and what was the result?

SrinivasRIL commented 8 years ago

I set all the properties described in the article and created a bucketed table, but in the usual way, with the unenclosed JSON format, and loaded the data. Unsurprisingly, the update did not give the expected result, because I didn't store the table as an ORC file.

update bucketenode set businessranking ='2' where state = 'Goa'

I got the following error:

Attempt to do update or delete using transaction manager that does not support these operations.

I set the following properties:

hive> set hive.support.concurrency = true;
hive> set hive.enforce.bucketing = true;
hive> set hive.exec.dynamic.partition.mode = nonstrict;
hive> set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> set hive.compactor.initiator.on = true;
hive> set hive.compactor.worker.threads = 1;

I am not sure how to store it as an ORC file when we have to specify STORED AS INPUTFORMAT 'com.esri.json.hadoop.UnenclosedJsonInputFormat' when we create a table with a JSON file loaded into it.

If you can provide any insights on this problem, that would be great, as we need to start updating records in the near future.

randallwhitman commented 8 years ago

> I am not sure how to store it as an ORC file when we have to specify STORED AS INPUTFORMAT 'com.esri.json.hadoop.UnenclosedJsonInputFormat' when we create a table with a JSON file loaded into it.

JSON and ORC are completely different formats. You can choose to work with data in its original format, or convert one format to another.
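
One way to read "convert one format to another" is to keep the JSON-backed table as a read-only source and copy its rows into a separate ORC table that is declared transactional, then run the updates against the ORC copy. The following is only a sketch under stated assumptions, not a tested recipe for this dataset: the table names bucketenode_json and bucketenode_orc, the shape column, its BINARY type, and the bucket count are all hypothetical, and the exact settings needed vary by Hive version.

```sql
-- Session settings, as listed earlier in this thread.
SET hive.support.concurrency = true;
SET hive.enforce.bucketing = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Hypothetical ORC table mirroring the JSON-backed table.
-- Assumption: the geometry is held in a BINARY column named shape.
CREATE TABLE bucketenode_orc (
  state           STRING,
  businessranking STRING,
  shape           BINARY
)
CLUSTERED BY (state) INTO 4 BUCKETS   -- bucket count chosen arbitrarily
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Copy the rows from the existing JSON-backed table (hypothetical name).
INSERT INTO TABLE bucketenode_orc
SELECT state, businessranking, shape
FROM bucketenode_json;

-- The update from earlier in the thread, now against the ORC copy.
-- Note: Hive does not allow updating the bucketing column (state here).
UPDATE bucketenode_orc
SET businessranking = '2'
WHERE state = 'Goa';
```

With this layout the JSON input format is only needed on the source table; the ORC table uses plain STORED AS ORC and does not take a STORED AS INPUTFORMAT clause.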

randallwhitman commented 7 years ago

This is Open-Source and accepts contributions.