rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

sqoop export DATE columns from Hive to ORACLE #221

Open yaronkalatian opened 4 years ago

yaronkalatian commented 4 years ago

I've created an external table in Hive over Parquet:

```sql
DROP TABLE H_LINEITEM_EXT;
CREATE EXTERNAL TABLE H_LINEITEM_EXT (
  L_ORDERKEY      string,
  L_PARTKEY       string,
  L_SUPPKEY       string,
  L_LINENUMBER    string,
  L_QUANTITY      string,
  L_EXTENDEDPRICE string,
  L_DISCOUNT      string,
  L_TAX           string,
  L_RETURNFLAG    string,
  L_LINESTATUS    string,
  L_SHIPDATE      date,
  L_COMMITDATE    date,
  L_RECEIPTDATE   date,
  L_SHIPINSTRUCT  string,
  L_SHIPMODE      string,
  L_COMMENT       string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS
  INPUTFORMAT  "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
  OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
LOCATION '/hdata/H_LINEITEM';
```

I want to export this table to Oracle with Sqoop:

```shell
sqoop export \
  -Dsqoop.export.records.per.statement=1000000 \
  -Dsqoop.export.statements.per.transaction=1000000 \
  --connect "jdbc:oracle:thin:@192.168.1.32:1522:orcl" \
  --password "system" --username "sys as sysdba" \
  --table "TPCH.HIVE_LINEITEM" \
  --hcatalog-table "H_LINEITEM_EXT" \
  --hcatalog-database "default" \
  --hcatalog-home /home/hadoop/hive \
  -m 3 \
  --batch \
  --verbose
```

The process fails with the following exception:

```
2019-09-03 18:23:07,643 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot be cast to java.sql.Date
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportHelper.convertToSqoop(SqoopHCatExportHelper.java:193)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportHelper.convertToSqoopRecord(SqoopHCatExportHelper.java:138)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportMapper.map(SqoopHCatExportMapper.java:56)
    at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportMapper.map(SqoopHCatExportMapper.java:35)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
```

Specific versions I use:

- hadoop-3.1.1
- sqoop-1.4.7

I've tried adding:

```
--map-column-java "l_shipdate=String,l_commitdate=String,l_receiptdate=String"
```

or

```
--map-column-java "L_SHIPDATE=java.sql.Date,L_COMMITDATE=java.sql.Date,L_RECEIPTDATE=java.sql.Date"
```

but no luck.

If I create a second Hive table with all-string columns, populated from the external table H_LINEITEM_EXT, then the export succeeds. But this is not a good solution, because it doubles both the load time and the storage space.
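A possible middle ground I'm considering (untested sketch; the view name is made up, and I'm not sure Sqoop's HCatalog integration can even read from a view rather than a table) would be a Hive view that casts the DATE columns to STRING, so no data is materialized:

```sql
-- Hypothetical workaround: expose the date columns as strings via a view.
-- No data is copied; the cast happens at read time.
CREATE VIEW H_LINEITEM_STR AS
SELECT
  L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER,
  L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX,
  L_RETURNFLAG, L_LINESTATUS,
  CAST(L_SHIPDATE    AS STRING) AS L_SHIPDATE,
  CAST(L_COMMITDATE  AS STRING) AS L_COMMITDATE,
  CAST(L_RECEIPTDATE AS STRING) AS L_RECEIPTDATE,
  L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT
FROM H_LINEITEM_EXT;
```

If views are not supported by the `--hcatalog-table` export path, this would still need a materialized table, which brings back the duplication problem.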

Could you please advise me on the right solution?