mattshma / bigdata

hadoop,hbase,storm,spark,etc..

MR job succeeds but produces no data #29

Closed mattshma closed 8 years ago

mattshma commented 8 years ago

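The RM web UI shows the job's final status as successful, but the job actually produced no data. Rerunning the business team's code gave the following output: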

/usr/bin/../bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/../bin/hadoop: fork: retry: Resource temporarily unavailable
/opt/cloudera/parcels/CDH-5.1.4-1.cdh5.1.4.p0.15/bin/../lib/hadoop/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/../bin/hadoop: fork: retry: Resource temporarily unavailable
/opt/cloudera/parcels/CDH-5.1.4-1.cdh5.1.4.p0.15/bin/../lib/hadoop/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/../bin/hadoop: fork: retry: Resource temporarily unavailable
/opt/cloudera/parcels/CDH-5.1.4-1.cdh5.1.4.p0.15/lib/hadoop/libexec/hadoop-layout.sh: fork: retry: Resource temporarily unavailable
....
16/05/13 11:36:58 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at org.apache.hadoop.net.unix.DomainSocketWatcher.<init>(DomainSocketWatcher.java:239)
    at org.apache.hadoop.hdfs.client.DfsClientShmManager.<init>(DfsClientShmManager.java:413)
    at org.apache.hadoop.hdfs.client.ShortCircuitCache.<init>(ShortCircuitCache.java:382)
    at org.apache.hadoop.hdfs.ClientContext.<init>(ClientContext.java:96)
    at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:145)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:521)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:146)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:204)
    at org.apache.hadoop.hive.ql.Context.getExternalScratchDir(Context.java:271)
    at org.apache.hadoop.hive.ql.Context.getExternalTmpFileURI(Context.java:364)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:5000)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:7440)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:7332)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8135)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8361)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:317)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:454)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:352)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:995)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1038)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:357)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
...
java.io.IOException: Unable to close file because the last block does not have enough number of replicas.
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2142)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2110)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:70)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:103)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:140)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:213)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
java.io.IOException: Unable to close file because the last block does not have enough number of replicas.
...
16/05/13 11:39:35 WARN mapreduce.ExportJobBase: Input path hdfs://hadoop/user/hive/warehouse/data_analyze.db//grouptype=*/game_id=97/ds=20160512/* does not exist
16/05/13 11:46:40 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input Pattern hdfs://hadoop/user/hive/warehouse/data_analyze.db/game_id=97/ds=20160512/* matches 0 files
...
16/05/13 11:51:27 INFO mapreduce.Job: Task Id : attempt_1449806584223_1947763_m_000000_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: java.sql.BatchUpdateException: Data truncation: Data too long for column 'date' at row 1
    at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
    at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
    ... 10 more
Caused by: java.sql.BatchUpdateException: Data truncation: Data too long for column 'date' at row 1
    at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1981)
    at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1388)
    at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:231)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'date' at row 1
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4224)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2840)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2082)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2334)
    at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1933)
    ... 2 more
...
java.net.SocketTimeoutException: 75000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.2.1.13:45802 remote=/10.2.72.25:50010]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1986)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1281)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
16/05/13 12:02:57 INFO hdfs.DFSClient: Abandoning BP-1471860497-10.2.72.29-1421306158975:blk_1363898175_290240818
16/05/13 12:02:57 INFO hdfs.DFSClient: Excluding datanode 10.2.72.25:50010

mattshma commented 8 years ago

Added -Xss256k to the YARN JVM options.
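
One plausible reading of why a smaller thread stack helps: "fork: retry: Resource temporarily unavailable" and "unable to create new native thread" usually mean the host ran out of room for new processes or threads, either through the per-user limit or through memory reserved for thread stacks. The sketch below is only a rough illustration of that arithmetic, not a diagnosis of this cluster. The thread count of 2000 is hypothetical, the 1024 KiB figure assumes the common HotSpot default on 64-bit Linux, and the helper names are made up for this example.

```python
import resource


def stack_reservation_mib(threads: int, stack_kib: int) -> float:
    """Virtual memory reserved for thread stacks, in MiB."""
    return threads * stack_kib / 1024


def nproc_limit():
    """Return the (soft, hard) per-user process/thread limit.

    A value equal to resource.RLIM_INFINITY means no limit is set.
    """
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    threads = 2000  # hypothetical number of JVM threads on one host
    # 1024 KiB: assumed default stack size; 256 KiB: the value from -Xss256k
    for stack_kib in (1024, 256):
        mib = stack_reservation_mib(threads, stack_kib)
        print(f"{threads} threads x {stack_kib} KiB stack = {mib:.0f} MiB reserved")

    soft, hard = nproc_limit()
    print(f"per-user process/thread limit: soft={soft}, hard={hard}")
```

If the per-user limit rather than stack memory is the bottleneck, shrinking the stack alone may not be enough, so it is worth checking both numbers on the affected hosts.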