yc-huang / Hive-mongo

hive storage handler for connecting with MongoDB
Apache License 2.0

metadata exception #20

Open Deepak-Vohra opened 9 years ago

Deepak-Vohra commented 9 years ago

A SELECT query on a Hive table stored by the MongoDB storage handler generates org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null.

yc-huang commented 9 years ago

Could you post the full exception stack trace?

Deepak-Vohra commented 9 years ago

Both SELECT and INSERT OVERWRITE generate the NumberFormatException: null.

  1. Create external table:

     CREATE EXTERNAL TABLE wlslog_mongo(id string,CATEGORY string,TYPE string, SERVERNAME string,CODE string,MSG string)
     STORED BY 'org.yong3.hive.mongo.MongoStorageHandler'
     WITH SERDEPROPERTIES ("mongo.column.mapping"="_id,CATEGORY,TYPE,SERVERNAME,CODE,MSG")
     TBLPROPERTIES ( "mongo.host" = "10.0.2.15", "mongo.port" = "27017",
       "mongo.db" = "test", "mongo.user" = "mongo", "mongo.passwd" = "mongo", "mongo.collection" = "wlslog");
     OK
     Time taken: 23.083 seconds

  2. Run a select query:

     hive> select * from wlslog_mongo;
     OK
     Failed with exception java.io.IOException:java.lang.NumberFormatException: null
     Time taken: 24.752 seconds

  3. Add data with INSERT OVERWRITE.

hive> INSERT OVERWRITE TABLE wls_mongo SELECT time_stamp,category,type,servername,code,msg FROM wlslog;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/cdh/hadoop-2.6.0-cdh5.4.7/share/hadoop/mapreduce1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/cdh/hadoop-2.6.0-cdh5.4.7/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/09/29 14:03:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/29 14:03:13 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
Execution log at: /tmp/root/root_20150929140101_81084117-f4ac-4a3a-bc97-9a0977cffe3e.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2015-09-29 14:04:18,810 null map = 0%, reduce = 0%
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:289)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:506)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:458)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:550)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:454)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.yong3.hive.mongo.MongoTable.<init>(MongoTable.java:20)
    at org.yong3.hive.mongo.MongoWriter.<init>(MongoWriter.java:24)
    at org.yong3.hive.mongo.MongoOutputFormat.getHiveRecordWriter(MongoOutputFormat.java:29)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:299)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:286)
    ... 19 more
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:289)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:506)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:458)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:847)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:579)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:230)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:454)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.yong3.hive.mongo.MongoTable.<init>(MongoTable.java:20)
    at org.yong3.hive.mongo.MongoWriter.<init>(MongoWriter.java:24)
    at org.yong3.hive.mongo.MongoOutputFormat.getHiveRecordWriter(MongoOutputFormat.java:29)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:299)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:286)
    ... 17 more
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:847)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:579)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:230)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:289)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:506)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:458)
    ... 15 more
Caused by: java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:454)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.yong3.hive.mongo.MongoTable.<init>(MongoTable.java:20)
    at org.yong3.hive.mongo.MongoWriter.<init>(MongoWriter.java:24)
    at org.yong3.hive.mongo.MongoOutputFormat.getHiveRecordWriter(MongoOutputFormat.java:29)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:299)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:286)
    ... 17 more
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:847)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:579)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:230)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:289)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:506)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:458)
    ... 15 more
Caused by: java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:454)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.yong3.hive.mongo.MongoTable.<init>(MongoTable.java:20)
    at org.yong3.hive.mongo.MongoWriter.<init>(MongoWriter.java:24)
    at org.yong3.hive.mongo.MongoOutputFormat.getHiveRecordWriter(MongoOutputFormat.java:29)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:299)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:286)
    ... 17 more
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:847)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:579)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:230)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:289)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:506)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:458)
    ... 15 more
Caused by: java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:454)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.yong3.hive.mongo.MongoTable.<init>(MongoTable.java:20)
    at org.yong3.hive.mongo.MongoWriter.<init>(MongoWriter.java:24)
    at org.yong3.hive.mongo.MongoOutputFormat.getHiveRecordWriter(MongoOutputFormat.java:29)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:299)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:286)
    ... 17 more
Ended Job = job_local432685489_0001 with errors
Error during job, obtaining debugging information...
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-0

Logs:

/tmp/root/hive.log
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hive>

yc-huang commented 9 years ago

According to the exception stack, the NumberFormatException is thrown when parsing the MongoDB port:

this.db = new Mongo(host, Integer.valueOf(port)).getDB(dbName);

and it seems that the port specified in TBLPROPERTIES was not read properly (the value seen is null).
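To illustrate the failure mode, here is a minimal sketch (not the handler's actual code) of what happens when the port string is null. The class name NullPortDemo, its helper methods, and the use of java.util.Properties as a stand-in for the table properties are assumptions made for illustration only.

```java
import java.util.Properties;

final class NullPortDemo {
    // Hypothetical helper: parses the port the same way the quoted line does.
    static int parsePortStrict(String port) {
        return Integer.valueOf(port);
    }

    // Returns true if parsing the given string raises NumberFormatException.
    static boolean throwsNumberFormat(String port) {
        try {
            parsePortStrict(port);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the table properties; "mongo.port" is deliberately left out.
        Properties tbl = new Properties();
        tbl.setProperty("mongo.host", "10.0.2.15");

        String port = tbl.getProperty("mongo.port"); // null: the key was never set
        System.out.println("port = " + port);
        System.out.println("throws NumberFormatException: " + throwsNumberFormat(port));
    }
}
```

A null argument to Integer.valueOf(String) fails before any digits are examined, which matches the `Integer.parseInt` frame at the top of the "Caused by" section in the trace.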

Please make sure there are no abnormal characters in the CREATE TABLE SQL; you could also try the latest stable version of Hive (I tried Hive 1.2.1 and it works for me). If that doesn't resolve it, we may need to add some verbose logging to help identify the problem.
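As a rough sketch of both suggestions, the snippet below shows how an invisible character in a property key could make the lookup return null, and how a more verbose check might report the problem. How the handler really reads its properties is not shown in this thread, so TablePropertyCheck, requireInt, and the java.util.Properties stand-in are all hypothetical.

```java
import java.util.Properties;

final class TablePropertyCheck {
    // Hypothetical helper: reads a required integer property and fails with a
    // descriptive message (listing the keys actually present) instead of a
    // bare "NumberFormatException: null".
    static int requireInt(Properties props, String key) {
        String raw = props.getProperty(key);
        if (raw == null) {
            throw new IllegalArgumentException(
                "Missing table property '" + key + "'; keys present: "
                    + props.stringPropertyNames());
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // The key ends with a non-breaking space (U+00A0), as might happen when
        // SQL is pasted from a web page; it prints like the normal key.
        props.setProperty("mongo.port\u00A0", "27017");

        // The exact key is absent, so the lookup returns null.
        System.out.println("lookup: " + props.getProperty("mongo.port"));

        try {
            requireInt(props, "mongo.port");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Retyping the CREATE TABLE statement by hand rather than pasting it is one way to rule out this kind of stray character.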