RevolutionAnalytics / rmr2

A package that allows R developers to use Hadoop MapReduce

Streaming error: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 #112


babumathew commented 10 years ago

I have a 4-node cluster with R, Hadoop, and rmr2 installed on all the nodes. Running a sample job produces the following errors. I am not sure where to look; any insights would be very helpful.

thanks Babu

14/05/20 12:02:47 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [/tmp/RtmpcjltyX/rmr-local-env66c625272a9d, /tmp/RtmpcjltyX/rmr-global-env66c65806b668, /tmp/RtmpcjltyX/rmr-streaming-map66c62b1b7c09, /tmp/RtmpcjltyX/rmr-streaming-reduce66c6674c771a] [/opt/cloudera/parcels/CDH-5.0.0-0.cdh5b2.p0.27/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-2.jar] /tmp/streamjob2642347914010493291.jar tmpDir=null
14/05/20 12:02:48 INFO client.RMProxy: Connecting to ResourceManager at HadoopS.dbstraining.local/192.168.100.40:8032
14/05/20 12:02:48 INFO client.RMProxy: Connecting to ResourceManager at HadoopS.dbstraining.local/192.168.100.40:8032
14/05/20 12:02:49 INFO mapred.FileInputFormat: Total input paths to process : 1
14/05/20 12:02:49 INFO mapreduce.JobSubmitter: number of splits:2
14/05/20 12:02:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1395320373286_0043
14/05/20 12:02:49 INFO impl.YarnClientImpl: Submitted application application_1395320373286_0043
14/05/20 12:02:49 INFO mapreduce.Job: The url to track the job: http://HadoopS.dbstraining.local:8088/proxy/application_1395320373286_0043/
14/05/20 12:02:49 INFO mapreduce.Job: Running job: job_1395320373286_0043
14/05/20 12:02:56 INFO mapreduce.Job: Job job_1395320373286_0043 running in uber mode : false
14/05/20 12:02:56 INFO mapreduce.Job: map 0% reduce 0%
14/05/20 12:03:00 INFO mapreduce.Job: Task Id : attempt_1395320373286_0043_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/05/20 12:03:02 INFO mapreduce.Job: Task Id : attempt_1395320373286_0043_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/05/20 12:03:07 INFO mapreduce.Job: map 50% reduce 0%
14/05/20 12:03:08 INFO mapreduce.Job: Task Id : attempt_1395320373286_0043_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/05/20 12:03:13 INFO mapreduce.Job: Task Id : attempt_1395320373286_0043_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/05/20 12:03:19 INFO mapreduce.Job: map 100% reduce 100%
14/05/20 12:03:19 INFO mapreduce.Job: Job job_1395320373286_0043 failed with state FAILED due to: Task failed task_1395320373286_0043_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

14/05/20 12:03:19 INFO mapreduce.Job: Counters: 31
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=93541
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=5678
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=5
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Failed map tasks=5
        Launched map tasks=6
        Other local map tasks=4
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=23439
        Total time spent by all reduces in occupied slots (ms)=0
    Map-Reduce Framework
        Map input records=1
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=192
        Input split bytes=110
        Combine input records=0
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=33
        CPU time spent (ms)=850
        Physical memory (bytes) snapshot=484114432
        Virtual memory (bytes) snapshot=1568194560
        Total committed heap usage (bytes)=582287360
    File Input Format Counters
        Bytes Read=5568
14/05/20 12:03:19 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1
14/05/20 12:03:27 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://HadoopS.dbstraining.local:8020/tmp/file66c65a3a2c24' to trash at: hdfs://HadoopS.dbstraining.local:8020/user/bmathew/.Trash/Current

akhil29sep commented 10 years ago

This error occurs when the mapper is unable to read the input file. Try with a simple input file first and check.
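That advice can be sketched in shell (all file names below are hypothetical stand-ins, not taken from this thread). Hadoop Streaming feeds input split lines to the mapper's stdin and reads its stdout, so the same pipe can be reproduced locally before involving the cluster:

```shell
# Stand-in mapper for illustration; substitute the real streaming
# script that rmr2 ships to the cluster.
cat > mapper.sh <<'EOF'
#!/bin/sh
tr -s ' ' '\n'   # naive word-per-line map step
EOF
chmod +x mapper.sh

# A deliberately simple input file, as suggested above.
printf 'hello hadoop hello\n' > sample-input.txt

# Reproduce the Hadoop Streaming pipe locally; a non-zero exit status
# here corresponds to "subprocess failed with code 1" on the cluster.
./mapper.sh < sample-input.txt > map-output.txt
echo "mapper exit status: $?"
```

If the script already fails on a one-line file like this, the problem is in the script or its environment, not in the cluster configuration.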

piccolbo commented 10 years ago

Please follow the bug reporting guidelines in the wiki. From this log nobody other than the other user who replied can make a guess as to what went wrong, certainly not I.

Esquive commented 9 years ago

Dear,

I also have a 4-node Hadoop cluster (Cloudera CDH-5.2.1-1.cdh5.2.1.p0.12). Running the wordcount example in cluster mode I get the same error as the user who opened this thread, but randomly: it may occur once during a run of the wordcount code, or not at all. Sometimes the failing tasks get executed properly on another node, leading to a successful job, but sometimes the job fails because of too many failed tasks.

In local mode the code runs as expected.

The file I use is from http://www.textfiles.com/politics/0814gulf.txt. Before running the code, the file was cleaned using dos2unix and a control-character removal tool. As I said, in local mode everything runs fine, and with the same file the wordcount example that ships with Hadoop works as intended.
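As an aside, the carriage-return cleanup described above can be reproduced even where dos2unix is not installed; this is a generic sketch with illustrative file names, not the exact commands used:

```shell
# Create a small file with DOS (CRLF) line endings to demonstrate.
printf 'line one\r\nline two\r\n' > dos-file.txt

# tr -d '\r' performs the same conversion as dos2unix: it deletes every
# carriage-return byte, leaving Unix (LF-only) line endings behind.
tr -d '\r' < dos-file.txt > unix-file.txt

# The Unix copy is exactly one byte per line shorter than the DOS one.
wc -c dos-file.txt unix-file.txt
```

Stray CR bytes matter here because a streaming mapper that splits on whitespace may otherwise emit keys with invisible trailing characters.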

The following R packages are installed:

Packages in library /usr/local/lib/R/site-library:

bitops                  Bitwise Operations
caTools                 Tools: moving window statistics, GIF, Base64,
                        ROC AUC, etc.
devtools                Tools to make developing R code easier
digest                  Create Cryptographic Hash Digests of R Objects
evaluate                Parsing and evaluation tools that provide more
                        details than the default.
functional              Curry, Compose, and other higher-order
                        functions
httr                    Tools for Working with URLs and HTTP
iterators               Iterator construct for R
itertools               Iterator Tools
jsonlite                A Robust, High Performance JSON Parser and
                        Generator for R
manipulate              Interactive Plots for RStudio
memoise                 Memoise functions
mime                    Map filenames to MIME types
plyr                    Tools for splitting, applying and combining
                        data
R6                      Classes with reference semantics
Rcpp                    Seamless R and C++ Integration
RCurl                   General network (HTTP/FTP/...) client interface
                        for R
reshape2                Flexibly Reshape Data: A Reboot of the Reshape
                        Package.
rhdfs                   R and Hadoop Distributed Filesystem
rJava                   Low-level R to Java interface
RJSONIO                 Serialize R objects to JSON, JavaScript Object
                        Notation
rmr2                    R and Hadoop Streaming Connector
rstudio                 Tools and Utilities for RStudio
rstudioapi              Safely access the RStudio API.
stringr                 Make it easier to work with strings.
whisker                 {{mustache}} for R, logicless templating

Here are the stderr and syslog from one of the failed tasks:

 Log Type: stderr

Log Length: 1130

Loading objects:
  wordcount
Loading objects:
  backend.parameters
  combine
  combine.file
  combine.line
  debug
  default.input.format
Please review your hadoop settings. See help(hadoop.settings)
  default.output.format
  in.folder
  in.memory.combine
  input.format
  libs
  map
  map.file
  map.line
  out.folder
  output.format
  pkg.opts
  postamble
  preamble
  profile.nodes
  reduce
  reduce.file
  reduce.line
  rmr.global.env
  rmr.local.env
  save.env
  tempfile
  vectorized.reduce
  verbose
  work.dir
Loading required package: methods
Loading required package: rmr2
Loading required package: rJava
Loading required package: rhdfs

HADOOP_CMD=/usr/bin/hadoop

Be sure to run hdfs.init()
Loading objects:
  backend.parameters
  combine
  combine.file
  combine.line
  debug
  default.input.format
  default.output.format
  in.folder
  in.memory.combine
  input.format
  libs
  map
  map.file
  map.line
  out.folder
  output.format
  pkg.opts
  postamble
  preamble
  profile.nodes
  reduce
  reduce.file
  reduce.line
  rmr.global.env
  rmr.local.env
  save.env
  tempfile
  vectorized.reduce
  verbose
  work.dir

Log Type: stdout

Log Length: 0

Log Type: syslog

Log Length: 8901

Showing 4096 bytes of 8901 total.

:NA [rec/s]
2015-01-05 15:07:57,389 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=100/0/0 in:NA [rec/s] out:NA [rec/s]
2015-01-05 15:07:57,407 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=1000/0/0 in:NA [rec/s] out:NA [rec/s]
2015-01-05 15:07:58,981 INFO [Thread-12] org.apache.hadoop.streaming.PipeMapRed: Records R/W=8392/1
2015-01-05 15:07:59,017 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
2015-01-05 15:07:59,071 WARN [Thread-12] org.apache.hadoop.streaming.PipeMapRed: java.io.EOFException
2015-01-05 15:07:59,072 INFO [main] org.apache.hadoop.streaming.PipeMapRed: PipeMapRed failed!
java.lang.RuntimeException: java.io.EOFException
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:334)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:211)
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
    at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:56)
    at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:376)
2015-01-05 15:07:59,075 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.io.EOFException
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:334)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:211)
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
    at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:56)
    at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:376)

2015-01-05 15:07:59,080 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2015-01-05 15:07:59,087 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://babar.hadoop:8020/user/eric.falk/gulf.out/_temporary/1/_temporary/attempt_1418911299833_0092_m_000001_0
2015-01-05 15:07:59,192 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2015-01-05 15:07:59,193 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2015-01-05 15:07:59,193 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.

Thanks Eric

piccolbo commented 9 years ago

@Esquive, please open a separate issue.

Bharath25191 commented 7 years ago

I am facing the same issue. Did you find a solution?

rjrockzz commented 4 years ago

I was facing a similar issue while working with a Python script. Just add

#!/usr/bin/python

at the beginning of your scripts. The same can be done for other scripting languages.
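A minimal sketch of that advice, using a hypothetical word-count mapper (not code from this thread); `#!/usr/bin/env python3` is used below as a portable variant of the interpreter line, since the point is the same: without a shebang, the OS cannot exec the script directly and the streaming task dies with exit code 1.

```shell
# Hypothetical Python streaming mapper; the first line is the shebang
# that tells the OS which interpreter to exec for this file.
cat > mapper.py <<'EOF'
#!/usr/bin/env python3
import sys

# Emit one "<word>\t1" pair per input word, streaming-style.
for line in sys.stdin:
    for word in line.split():
        print("%s\t1" % word)
EOF
chmod +x mapper.py

# Because of the shebang, the script can now be exec'd directly,
# which is exactly how Hadoop Streaming invokes it.
echo 'to be or not to be' | ./mapper.py
```

Remember to re-upload or re-ship the script after adding the shebang; the copy the cluster runs is the one that matters.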