big-data-europe / docker-hadoop

Apache Hadoop docker image

not able to run mapred streaming -- subprocess failed with code 127 #95

Open newoz1 opened 3 years ago

newoz1 commented 3 years ago

I am trying to run a mapreduce job, and I get the following error on the namenode

"docker exec -it namenode bash"

mapred streaming -files ./mapper.py,./reducer.py -mapper mapper.py -reducer mapper.py -input input -output output8

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127

Then I tried to run the Python scripts directly, but it seems that python is not installed in the container. Am I right?

headers on mapper and reducer are

#!/usr/bin/env python
# -*- coding: utf-8 -*-
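
For context on the error code: 127 is the exit status a shell or `env` reports when the command it was asked to run cannot be found, which is what happens when the shebang asks `env` for an interpreter that is not installed. A minimal sketch of that behaviour, using a made-up command name (`no_such_interpreter_for_demo`) in place of a missing `python`:

```shell
# Ask env to run an interpreter that does not exist. The name below is a
# made-up placeholder standing in for a missing "python".
status=0
env no_such_interpreter_for_demo 2>/dev/null || status=$?
echo "exit status: $status"   # prints: exit status: 127
```

Hadoop streaming surfaces that status as "subprocess failed with code 127".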

ii02735 commented 3 years ago

You're right. Please make sure that python is installed on the namenode, datanode, resourcemanager and nodemanager.

JanaFaganeli commented 3 years ago

I have been having the same issue: subprocess failed with code 127. It looks like python is not installed. How can I fix this? How can I install python on the nodes listed above?

danieladriano commented 3 years ago

@JanaFaganeli Were you able to install python in the nodes? Thanks

ii02735 commented 3 years ago

@JanaFaganeli You must install python or python3 (depending on the Python syntax your scripts use) on the different nodes. To do so, use these commands (replace python3 with python if needed):

docker exec -it namenode bash -c "apt update && apt install python3 -y"
docker exec -it datanode bash -c "apt update && apt install python3 -y"
docker exec -it resourcemanager bash -c "apt update && apt install python3 -y"
docker exec -it nodemanager bash -c "apt update && apt install python3 -y"

Of course, make sure that these nodes / containers are running.
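
The four commands above can also be written as a loop. This is only a sketch: the `run` helper, the `DRY_RUN` switch and the function name are illustrative additions, not part of the repository, and the container names are the ones used in this thread.

```shell
# Container names as used in this thread; adjust them to your own setup.
CONTAINERS="namenode datanode resourcemanager nodemanager"

# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, execute it otherwise (hypothetical helper).
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

install_python_everywhere() {
  for c in $CONTAINERS; do
    # No -it here: an interactive TTY is not needed inside a script.
    run docker exec "$c" bash -c "apt update && apt install python3 -y"
  done
}

install_python_everywhere
```

With `DRY_RUN=0` the loop runs the same `apt` commands as above, minus the `-it` flags.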

danieladriano commented 3 years ago

@ii02735 Thanks :-D

ii02735 commented 3 years ago

@danieladriano You're welcome 🙂

JanaFaganeli commented 3 years ago

Now it works fine. Thank you all for your help.

yusufani commented 3 years ago

You are the man of honor

dcguim commented 2 years ago

Thanks for sharing the helpful code snippet, @ii02735! Does anyone know why python has to be installed on the resourcemanager and nodemanager containers? AFAIK they are not running the mapred jobs.

ykhandelwal913 commented 2 years ago

I am still getting the same error, even though I have installed python or python3.

ykhandelwal913 commented 2 years ago

@ii02735 Does anything else need to be done?

manoloacademia commented 2 years ago

same :(
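
One thing that may be worth ruling out when the error persists after installing: the shebang has to name an interpreter that actually exists in the container (a script that asks for `python` will not find a package that only provides `python3`), and a file saved with Windows line endings makes `env` look for `python` followed by a carriage return. A small sketch of a local check; `check_shebang` is a hypothetical helper, not part of Hadoop or this repository:

```shell
# check_shebang FILE: report problems in the first line of a streaming script.
check_shebang() {
  first_line=$(head -n 1 "$1")
  cr=$(printf '\r')
  case "$first_line" in
    '#!'*"$cr") echo "$1: shebang ends with a carriage return (CRLF file)"; return 1 ;;
    '#!'*)      echo "$1: shebang is: $first_line"; return 0 ;;
    *)          echo "$1: no shebang on the first line"; return 1 ;;
  esac
}

# Demo on a throwaway file; with real scripts, run: check_shebang mapper.py
demo=$(mktemp)
printf '#!/usr/bin/env python3\nprint("ok")\n' > "$demo"
check_shebang "$demo"
rm -f "$demo"
```

This only inspects the script itself; it does not confirm that the interpreter is installed in the containers.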

syahirazman commented 9 months ago

I'm facing the same issue while running mapper.py and reducer.py. I already tried to install python on each container as described by @ii02735, but I keep getting this output (see attached picture). Now I have no idea how to solve this issue :(

[Attached image: Screenshot 2024-01-27 191026]

langdon commented 7 months ago

@syahirazman see below

I was getting the same error. The problem is that the Debian version these images use (stretch) is now an archived release. As a result, the apt sources.list needs to be updated to point at the archive (or the containers need to be updated to a newer base).

Create a sources.list file with:

deb http://archive.debian.org/debian/ stretch main
deb http://archive.debian.org/debian/ stretch-updates main

Then copy it into each container you need python in, e.g.:

docker cp sources.list namenode:/etc/apt/sources.list

You may have to split the install line so that a security-related failure in apt update does not stop the install (or find a way to fix it), e.g.:

docker exec -it namenode bash -c "apt update || :"
docker exec -it namenode bash -c "apt install python -y"
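
Put together, the steps above could be scripted roughly as follows. This is a sketch under assumptions: it writes `sources.list` into the current directory, uses the container names from earlier in the thread, and only prints the `docker` commands so they can be reviewed first. `print_fix_commands` is an illustrative name.

```shell
# Write the archive sources.list quoted above into the current directory.
cat > sources.list <<'EOF'
deb http://archive.debian.org/debian/ stretch main
deb http://archive.debian.org/debian/ stretch-updates main
EOF

# Print the copy and install commands for every container (nothing is executed).
print_fix_commands() {
  for c in namenode datanode resourcemanager nodemanager; do
    echo "docker cp sources.list $c:/etc/apt/sources.list"
    # "|| :" keeps going even if apt update reports errors.
    echo "docker exec $c bash -c 'apt update || :'"
    echo "docker exec $c bash -c 'apt install python -y'"
  done
}

print_fix_commands
```

To actually apply the changes, the printed lines can be piped to a shell, for example `print_fix_commands | sh`.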

syahirazman commented 6 months ago

Thank you for the reply!