big-data-europe / docker-hadoop

Apache Hadoop docker image

[Beginner] How to install Python? #109

Closed (axel584 closed this issue 3 years ago)

axel584 commented 3 years ago

Hi, I would like to try a Hadoop cluster with a Python script.

I launch the cluster with "docker-compose up -d" => it works fine.
I log in to the namenode with "docker exec -it namenode /bin/bash" => it works fine.
I search for the Python package with "apt search python", but I find nothing that looks like a Python interpreter.

I would like to add "pip" and the "mrjob" package...

Thank you for your help,

Axel

Gianlucamariani1996 commented 3 years ago

Hi,

I did the following, in order:

To enter the namenode container as root:
"docker exec -i -t -u root namenode /bin/bash"

Then, inside the container:
"apt-get update -y"
"apt-get install -y python3"
"hadoop jar /opt/hadoop-3.2.1/share/hadoop/tools/lib/hadoop-streaming-3.2.1.jar -mapper "container-mapper-file-path" -reducer "container-reducer-file-path" -input "hdfs-input-file-path" -output "hdfs-output-file-path""

It works fine for me.

Gianluca
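
For readers wondering what the mapper and reducer files passed to hadoop-streaming might contain, here is a minimal word-count sketch. It is not taken from this thread: the file name wordcount.py, the function names, and the "map"/"reduce" argument convention are all assumptions made for illustration. It relies only on the usual Hadoop Streaming contract, where records are read from stdin, tab-separated key/value lines are written to stdout, and the reducer receives its input sorted by key.

```python
#!/usr/bin/env python3
"""Word-count mapper and reducer for Hadoop Streaming (illustrative sketch)."""
import sys
from itertools import groupby


def map_lines(lines):
    """Yield one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Sum the counts per word. Input must already be sorted by key."""
    # Skip lines without a tab so blank or malformed records do not crash the job.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical convention: the first argument ("map" or "reduce") selects the role.
# Nothing runs when the file is imported or started without an argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Under these assumptions, the -mapper value would be something like "python3 /path/to/wordcount.py map" and the -reducer value "python3 /path/to/wordcount.py reduce", where /path/to is a placeholder. python3 and the script have to be available wherever the tasks actually run. The same pipeline can be tried without Hadoop by piping a text file through the map step, then "sort", then the reduce step.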

axel584 commented 3 years ago

Thank you very much, "apt update" was the missing command ;-)