Closed baristahell closed 7 years ago
Hi @baristahell ,
How did you set up Hadoop and Spark for the Docker image? Were you able to ssh
into your localhost?
Hello @arundasan91 , I have a Dockerfile similar to https://hub.docker.com/r/cimenx/caffeonspark/~/dockerfile/ , so the Hadoop and Spark installations are managed by the CaffeOnSpark scripts; I didn't change that part.
I haven't tried to ssh into localhost yet.
@baristahell I see this:
java.lang.UnsatisfiedLinkError: no caffedistri in java.library.path
It's likely your LD_LIBRARY_PATH was not set properly. Somewhere you should have libcaffedistri.so after compilation, and LD_LIBRARY_PATH should be set accordingly.
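For example, a quick check inside the container could look like this (the clone location below is an assumption; adjust it to wherever your CaffeOnSpark checkout lives):

```shell
# Assumed clone location inside the container; adjust to your setup.
export CAFFE_ON_SPARK=/root/CaffeOnSpark
# After a successful `make build`, the native library should show up here:
find "$CAFFE_ON_SPARK" -name 'libcaffedistri.so' 2>/dev/null
# Put both distribute/lib directories on the load path so the JVM can find it:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CAFFE_ON_SPARK/caffe-public/distribute/lib:$CAFFE_ON_SPARK/caffe-distri/distribute/lib"
echo "$LD_LIBRARY_PATH"
```

If `find` prints nothing, the UnsatisfiedLinkError is expected: the library was never built, and no LD_LIBRARY_PATH setting will fix that.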
@baristahell , please also check the output of jps. I have a working docker container; I had to make sure that Hadoop was working fine first.
Also, in the dockerfile you shared, please correct the ENV variables: some of them have an = sign in them, which is not necessary. Please make sure all the required environment variables are set.
To set the LD_LIBRARY_PATH, please include this to your docker file
ENV LD_LIBRARY_PATH $LD_LIBRARY_PATH:$CAFFE_ON_SPARK/caffe-public/distribute/lib:$CAFFE_ON_SPARK/caffe-distri/distribute/lib
Thanks for the answers. I already had the library path set, but only after the make build step; I moved it before that step, but it didn't change the output. When I run jps in the temporary image I get 15 Jps, and I don't really understand what that means.
If you have time to check the Dockerfile I use, I'll put it here; I don't really see what's wrong, and I bet there's a lot. If you're able to share your Dockerfile, that would be amazing; I'm struggling with this.
## Reference:
## https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_local
## https://github.com/yahoo/CaffeOnSpark/blob/master/caffe-grid/src/test/python/PythonTest.sh
FROM ubuntu:14.04
ENV pwd /root
## Install git, wget, bc and dependencies
# https://github.com/Kaixhin/dockerfiles/blob/master/caffe/deps/Dockerfile
RUN apt-get update && apt-get install -y \
git \
wget \
bc \
cmake \
libatlas-base-dev \
libatlas-dev \
libboost-all-dev \
libopencv-dev \
libprotobuf-dev \
libgoogle-glog-dev \
libgflags-dev \
protobuf-compiler \
libhdf5-dev \
libleveldb-dev \
liblmdb-dev \
libsnappy-dev \
python-dev \
python-pip \
python-numpy \
maven \
software-properties-common \
gfortran > /dev/null && \
pip install --upgrade pip
## Install Oracle Java 8
# https://github.com/dockerfile/java/blob/master/oracle-java8/Dockerfile
RUN \
echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections && \
add-apt-repository -y ppa:webupd8team/java && \
apt-get update && \
apt-get install -y oracle-java8-installer && \
rm -rf /var/lib/apt/lists/* && \
rm -rf /var/cache/oracle-jdk8-installer
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# Get CaffeOnSpark
RUN cd ${pwd} && git clone https://github.com/cimenx/CaffeOnSpark.git --recursive
ENV CAFFE_ON_SPARK ${pwd}/CaffeOnSpark
# Configure CaffeOnSpark
ADD requirements.txt ${CAFFE_ON_SPARK}/caffe-public/python/
RUN cd ${CAFFE_ON_SPARK}/caffe-public/ && \
pip install -U -r python/requirements.txt
# Add OpenBlas & protobuf
RUN cd root && mkdir temp && cd temp && mkdir OpenBlas-0.2.19 && mkdir protobuf-2.5.0
ADD OpenBLAS-0.2.19 /OpenBlas-0.2.19
ADD protobuf-2.5.0 /protobuf-2.5.0
RUN cd /OpenBlas-0.2.19 && make
RUN cd /protobuf-2.5.0 && ./configure && make && make check && make install
# Add settings xml for Maven
ADD settings.xml /root/.m2/settings.xml
RUN cd ${CAFFE_ON_SPARK}/caffe-public/
# Install Spark and Hadoop
ADD Makefile.config ${CAFFE_ON_SPARK}/caffe-public/
RUN cd ${pwd} && bash ${CAFFE_ON_SPARK}/scripts/local-setup-hadoop.sh && \
bash ${CAFFE_ON_SPARK}/scripts/local-setup-spark.sh
ENV HADOOP_HOME ${pwd}/hadoop-2.6.4
ENV SPARK_HOME ${pwd}/spark-1.6.0-bin-hadoop2.6
ENV PATH ${HADOOP_HOME}/bin:${SPARK_HOME}/bin:${SPARK_HOME}/sbin:${PATH}
ENV LD_LIBRARY_PATH $LD_LIBRARY_PATH:$CAFFE_ON_SPARK/caffe-public/distribute/lib:$CAFFE_ON_SPARK/caffe-distri/distribute/lib
# Build CaffeOnSpark
RUN cd ${CAFFE_ON_SPARK} && \
make build
# ENV LD_LIBRARY_PATH ${CAFFE_ON_SPARK}/caffe-public/distribute/lib #:${CAFFE_ON_SPARK}/caffe-distri/distribute/lib
RUN cd ${CAFFE_ON_SPARK}/data/ && unzip ${CAFFE_ON_SPARK}/caffe-grid/target/caffeonsparkpythonapi.zip
EXPOSE 8080 7077 8081
WORKDIR /root/spark-1.6.0-bin-hadoop2.6/
# Shell form, so the IPYTHON=1 prefix and the ${...} variables are expanded
ENTRYPOINT IPYTHON=1 pyspark \
    --driver-library-path "${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar" \
    --driver-class-path "${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar" \
    --jars "${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar" \
    --py-files "${CAFFE_ON_SPARK}/caffe-grid/target/caffeonsparkpythonapi.zip" \
    --files "${CAFFE_ON_SPARK}/data/caffe/_caffe.so" \
    --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.DYLD_LIBRARY_PATH="${LD_LIBRARY_PATH}"
(I just realized I cloned the repo from cimenx and not from yahoo, but I get the same error either way.)
@baristahell , I will take a look at the Dockerfile.
Please add this to your docker file for passwordless SSH.
# Passwordless SSH
RUN ssh-keygen -q -N "" -t dsa -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -q -N "" -t rsa -f /etc/ssh/ssh_host_rsa_key
RUN ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
RUN cp /root/.ssh/id_rsa.pub ~/.ssh/authorized_keys
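A quick way to sanity-check that setup is to compare key fingerprints; done here in a throwaway directory so it can be run anywhere without touching real keys:

```shell
# Generate a key pair the same way as above, in a temporary directory:
tmp=$(mktemp -d)
ssh-keygen -q -N "" -t rsa -f "$tmp/id_rsa"
# Authorize it, as the cp line in the Dockerfile does:
cp "$tmp/id_rsa.pub" "$tmp/authorized_keys"
# The two fingerprints must match, or passwordless login will fail:
ssh-keygen -lf "$tmp/id_rsa.pub"
ssh-keygen -lf "$tmp/authorized_keys"
```

Inside the real container, run the same `ssh-keygen -lf` pair against /root/.ssh/id_rsa.pub and ~/.ssh/authorized_keys.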
From what I found, you should also manually copy the xml config files in CaffeOnSpark/scripts to your HADOOP_HOME/etc/hadoop. That ensures your hadoop environment has all the defaults set.
libprotobuf-dev and libopenblas-dev are enough for installing protobuf and openblas. Unless you are doing heavy programming with them, there is no need to build them from source.
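In Dockerfile form, that copy step could look like this (the paths are assumptions based on the local-setup scripts; match them to your own CAFFE_ON_SPARK and HADOOP_HOME):

```dockerfile
# Copy CaffeOnSpark's template XML configs into hadoop's conf directory
# (hypothetical paths; adjust to your CAFFE_ON_SPARK and HADOOP_HOME)
RUN cp ${CAFFE_ON_SPARK}/scripts/*.xml ${HADOOP_HOME}/etc/hadoop/
```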
Even with SPARK_HOME/bin in my $PATH variable, I was not able to run spark-submit as root. So it is better to set the paths in the docker file like this:
ENV HADOOP_HOME=/usr/local/hadoop
ENV SPARK_HOME=/usr/local/spark
ENV PATH $PATH:$JAVA_HOME/bin
ENV PATH $PATH:$HADOOP_HOME/bin
ENV PATH $PATH:$SPARK_HOME/bin
ENV PATH $PATH:$SPARK_HOME/sbin
As for jps: it lists the running Java processes. If you have properly set up hadoop, then after running start-dfs.sh and start-yarn.sh you should get something like this from jps:
root@85e8ee9d8f41:~# jps
273 DataNode
145 NameNode
452 SecondaryNameNode
740 NodeManager
1240 Jps
633 ResourceManager
Since you are only getting the Jps process itself, I suspect there are problems with your hadoop config. Getting hadoop to work on docker is not trivial. Please check this link for more info on installing hadoop on Docker:
https://hub.docker.com/r/sequenceiq/hadoop-docker/~/dockerfile/
I hope I will be able to push a Dockerfile to CaffeOnSpark repo soon. Will definitely update here as well.
In the meantime, please check whether this docker image works for you:
docker pull arundas/caffeonspark
The tag is v1.
You'll have to continue from Step 7 (Install mnist and cifar10 dataset into its HDFS) of the Getting Started Guide: https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_yarn
@baristahell , If you do not want to download the whole image, there is a Dockerfile attached to PR #208. Please use it to create a container and let us know whether it works.
Thanks for the help! I'll check that tomorrow and keep you updated. I already have the cluster working (well, one never knows), so I should be able to tell you soon whether I managed to get it to run. I'll try both the image and the Dockerfile and let you know how it went.
Great. Take your time.
First update: the image works and I am able to create a working container.
The only issue so far was that YARN_CONF_DIR wasn't defined, so I had to set it manually before submitting the test job.
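Concretely, setting the missing variable inside the running container is just an export; the hadoop path below is an assumption based on where the image installs hadoop:

```shell
# Point YARN at the hadoop config directory (assumed install location):
export YARN_CONF_DIR=/usr/local/hadoop/etc/hadoop
echo "$YARN_CONF_DIR"
```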
@baristahell , Thanks for the update. I had to set either HADOOP_CONF_DIR or YARN_CONF_DIR, so I chose the former. I should have defined both; I will update the image.
When using the dockerfile mentioned in the PR, you will have to copy the supporting files in the config directory as well. YARN_CONF_DIR should be updated in the dockerfile too, if required.
Thanks.
Adding this to the docker file will mitigate the error that you faced:
ENV YARN_CONF_DIR /usr/local/hadoop/etc/hadoop
I must have missed HADOOP_CONF_DIR in the image uploaded to docker hub. I have already included HADOOP_CONF_DIR in the dockerfile in #208. That docker file should work out of the box without errors, since I am able to get it to work.
So, I'm trying to build a docker image for CaffeOnSpark, but the make step doesn't work. I based my Dockerfile on the CaffeOnSpark tutorial and added OpenBlas and Protobuf (I had issues because they were missing). Still, I have problems with the test step that prevent me from successfully building the image. Can you help me? I don't really get it.
(It seems to work until there, but then I start getting some warnings. I'm not sure whether they're actually problematic.)
You can see here that the build step itself was successful.