microsoft / pai

Resource scheduling and cluster management for AI
https://openpai.readthedocs.io
MIT License

Cannot mount hdfs #1581

Closed: qyyy closed this issue 5 years ago

qyyy commented 6 years ago

Following this document, I tried to install hadoop-hdfs-fuse with apt-get install hadoop-hdfs-fuse while building a Docker image, but the build failed. The error message is as follows:

E: The method driver /usr/lib/apt/methods/https could not be found.
W: http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/Release.gpg: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/Release.gpg: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
E: Failed to fetch https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh/dists/trusty-cdh5/InRelease
E: Some index files failed to download. They have been ignored, or old ones used instead.
The command '/bin/sh -c wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb &&     dpkg -i cdh5-repository_1.0_all.deb &&      apt-get -y update && apt-get install hadoop-hdfs-fuse' returned a non-zero code: 100

My Dockerfile is as follows:

FROM pai.build.mpi:openmpi1.10.4-hadoop2.7.2-cuda8.0-cudnn6-devel-ubuntu16.04

ENV CNTK_VERSION=2.0.beta11.0

RUN apt-get -y update && \
    apt-get -y install git \
        fuse \
        golang \
        libjasper1 \
        libjpeg8 \
        libpng12-0 \
        libgfortran3 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /

# Install hdfs-mount
RUN wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb && \
    dpkg -i cdh5-repository_1.0_all.deb && \
    apt-get -y update && apt-get install hadoop-hdfs-fuse

# Install Anaconda
RUN ANACONDA_PREFIX="/root/anaconda3" && \
    ANACONDA_VERSION="3-4.1.1" && \
    ANACONDA_SHA256="4f5c95feb0e7efeadd3d348dcef117d7787c799f24b0429e45017008f3534e55" && \
    wget -q https://repo.continuum.io/archive/Anaconda${ANACONDA_VERSION}-Linux-x86_64.sh && \
    echo "$ANACONDA_SHA256 Anaconda${ANACONDA_VERSION}-Linux-x86_64.sh" | sha256sum --check --strict - && \
    chmod a+x Anaconda${ANACONDA_VERSION}-Linux-x86_64.sh && \
    ./Anaconda${ANACONDA_VERSION}-Linux-x86_64.sh -b -p ${ANACONDA_PREFIX} && \
    rm -rf Anaconda${ANACONDA_VERSION}-Linux-x86_64.sh && \
    $ANACONDA_PREFIX/bin/conda clean --all --yes

ENV PATH=/root/anaconda3/bin:/usr/local/mpi/bin:$PATH \
    LD_LIBRARY_PATH=/root/anaconda3/lib:/usr/local/mpi/lib:$LD_LIBRARY_PATH

# Get CNTK Binary Distribution
RUN CNTK_VERSION_DASHED=$(echo $CNTK_VERSION | tr . -) && \
    CNTK_SHA256="2e60909020a0f553431dc7f7818401cc1bb2c99eef307d65bb552c497993593a" && \
    wget -q https://cntk.ai/BinaryDrop/CNTK-${CNTK_VERSION_DASHED}-Linux-64bit-GPU.tar.gz && \
    echo "$CNTK_SHA256 CNTK-${CNTK_VERSION_DASHED}-Linux-64bit-GPU.tar.gz" | sha256sum --check --strict - && \
    tar -xzf CNTK-${CNTK_VERSION_DASHED}-Linux-64bit-GPU.tar.gz && \
    rm -f CNTK-${CNTK_VERSION_DASHED}-Linux-64bit-GPU.tar.gz && \
    wget -q https://raw.githubusercontent.com/Microsoft/CNTK-docker/master/ubuntu-14.04/version_2/${CNTK_VERSION}/gpu/runtime/install-cntk-docker.sh \
         -O /cntk/Scripts/install/linux/install-cntk-docker.sh && \
    /bin/bash /cntk/Scripts/install/linux/install-cntk-docker.sh && \
    /root/anaconda3/bin/conda clean --all --yes && \
    rm -rf /cntk/cntk/python

ENV PATH=/cntk/cntk/bin:$PATH \
    LD_LIBRARY_PATH=/cntk/cntk/lib:/cntk/cntk/dependencies/lib:$LD_LIBRARY_PATH

WORKDIR /root

DongZhaoYu commented 6 years ago

The error "E: The method driver /usr/lib/apt/methods/https could not be found." means that apt in your image cannot fetch repositories over HTTPS. You need to install apt-transport-https in your image. Please try apt-get install apt-transport-https before you install hdfs-mount.
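A minimal sketch of that step as a Dockerfile layer, assuming an Ubuntu 16.04 base image like the one in the question. The `test -e` check is an addition for illustration, not part of the original advice; it simply fails the build early if the method driver named in the error message is still missing.

```dockerfile
# Sketch only: install the HTTPS transport before any repository that is
# served over https:// gets added to the apt sources.
# The test -e line is an illustrative sanity check (an assumption, not from
# the thread): it stops the build if the https method driver is still absent.
RUN apt-get -y update && \
    apt-get -y install apt-transport-https && \
    test -e /usr/lib/apt/methods/https && \
    rm -rf /var/lib/apt/lists/*
```

This layer has to come before the layer that installs the Cloudera repository package, because the later apt-get update is what needs the HTTPS driver.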

qyyy commented 6 years ago

Installing hadoop-hdfs-fuse seems inconvenient. After adding apt-get install apt-transport-https, an error still occurs:

Step 1/10 : FROM pai.build.mpi:openmpi1.10.4-hadoop2.7.2-cuda8.0-cudnn6-devel-ubuntu16.04
 ---> 4eb5f95d4d72
Step 2/10 : ENV CNTK_VERSION=2.0.beta11.0
 ---> Using cache
 ---> 6d02281d271e
Step 3/10 : RUN apt-get -y update &&     apt-get -y install git         fuse         golang         libjasper1         libjpeg8         libpng12-0         libgfortran3                 apt-transport-https &&     apt-get clean &&     rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 8d6512c8a735
Step 4/10 : WORKDIR /
 ---> Using cache
 ---> 398f5df1fb64
Step 5/10 : RUN wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb &&     dpkg -i cdh5-repository_1.0_all.deb &&  apt-get -y update && apt-get install hadoop-hdfs-fuse
 ---> Running in de6a4afbbdfd
--2018-10-25 05:30:54--  http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb
Resolving archive.cloudera.com (archive.cloudera.com)... 151.101.72.167
Connecting to archive.cloudera.com (archive.cloudera.com)|151.101.72.167|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3508 (3.4K) [application/x-debian-package]
Saving to: 'cdh5-repository_1.0_all.deb'

     0K ...                                                   100%  326M=0s

2018-10-25 05:30:54 (326 MB/s) - 'cdh5-repository_1.0_all.deb' saved [3508/3508]

Selecting previously unselected package cdh5-repository.
(Reading database ... 28365 files and directories currently installed.)
Preparing to unpack cdh5-repository_1.0_all.deb ...
Unpacking cdh5-repository (1.0) ...
Setting up cdh5-repository (1.0) ...
gpg: keyring `/etc/apt/secring.gpg' created
gpg: keyring `/etc/apt/trusted.gpg.d/cloudera-cdh5.gpg' created
gpg: /etc/apt/trustdb.gpg: trustdb created
gpg: key 02A818DD: public key "Cloudera Apt Repository" imported
gpg: Total number processed: 1
gpg:               imported: 1
Get:1 https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5 InRelease [1931 B]
Ign:2 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64  InRelease
Ign:1 https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5 InRelease
Get:3 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:4 http://security.ubuntu.com/ubuntu xenial-security InRelease [107 kB]
Get:5 https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5/contrib Sources [11.6 kB]
Ign:6 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64  InRelease
Get:7 https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5/contrib amd64 Packages [27.5 kB]
Get:8 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64  Release [564 B]
Get:9 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64  Release [564 B]
Get:10 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64  Release.gpg [819 B]
Get:11 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64  Release.gpg [801 B]
Get:12 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64  Packages [156 kB]
Get:13 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64  Packages [31.9 kB]
Get:14 http://security.ubuntu.com/ubuntu xenial-security/universe Sources [97.2 kB]
Get:15 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
Get:16 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [732 kB]
Get:17 http://archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]
Get:18 http://archive.ubuntu.com/ubuntu xenial/universe Sources [9802 kB]
Get:19 http://security.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [12.7 kB]
Get:20 http://security.ubuntu.com/ubuntu xenial-security/universe amd64 Packages [498 kB]
Get:21 http://security.ubuntu.com/ubuntu xenial-security/multiverse amd64 Packages [3747 B]
Get:22 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:23 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:24 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:25 http://archive.ubuntu.com/ubuntu xenial/multiverse amd64 Packages [176 kB]
Get:26 http://archive.ubuntu.com/ubuntu xenial-updates/universe Sources [283 kB]
Get:27 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [1122 kB]
Get:28 http://archive.ubuntu.com/ubuntu xenial-updates/restricted amd64 Packages [13.1 kB]
Get:29 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [897 kB]
Get:30 http://archive.ubuntu.com/ubuntu xenial-updates/multiverse amd64 Packages [18.8 kB]
Get:31 http://archive.ubuntu.com/ubuntu xenial-backports/main amd64 Packages [7965 B]
Get:32 http://archive.ubuntu.com/ubuntu xenial-backports/universe amd64 Packages [8532 B]
Fetched 25.9 MB in 7s (3289 kB/s)
Reading package lists...
W: https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh/dists/trusty-cdh5/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: GPG error: https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 327574EE02A818DD
W: The repository 'https://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5 InRelease' is not signed.
W: http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/Release.gpg: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/Release.gpg: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
W: http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg are ignored as the file is not readable by user '_apt' executing apt-key.
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  avro-libs bigtop-jsvc bigtop-utils hadoop hadoop-0.20-mapreduce
  hadoop-client hadoop-hdfs hadoop-mapreduce hadoop-yarn libhdfs0
  netcat-openbsd parquet parquet-format psmisc zookeeper
The following NEW packages will be installed:
  avro-libs bigtop-jsvc bigtop-utils hadoop hadoop-0.20-mapreduce
  hadoop-client hadoop-hdfs hadoop-hdfs-fuse hadoop-mapreduce hadoop-yarn
  libhdfs0 netcat-openbsd parquet parquet-format psmisc zookeeper
0 upgraded, 16 newly installed, 0 to remove and 85 not upgraded.
Need to get 445 MB of archives.
After this operation, 516 MB of additional disk space will be used.
Do you want to continue? [Y/n] Abort.
The command '/bin/sh -c wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb &&    dpkg -i cdh5-repository_1.0_all.deb &&   apt-get -y update && apt-get install hadoop-hdfs-fuse' returned a non-zero code: 1

@DongZhaoYu could you give me an example of installing it in a clean environment? Thanks!

DongZhaoYu commented 5 years ago

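The log shows that the install command aborted at the confirmation prompt ("Do you want to continue? [Y/n] Abort."): apt-get install was run without -y, and a Docker build cannot answer the prompt interactively. The Cloudera repository also could not be verified (NO_PUBKEY 327574EE02A818DD). Use apt-get -y --allow-unauthenticated install hadoop-hdfs-fuse when installing hadoop-hdfs-fuse.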
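As a possible alternative to --allow-unauthenticated, the warnings in the log say the keyring /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg is ignored because it is not readable by the '_apt' user. The sketch below assumes that making the keyring world-readable is enough for apt to verify the Cloudera repository; this was not tested in the thread, so treat it as a hypothesis to try rather than a confirmed fix.

```dockerfile
# Sketch only, untested in this thread.
# Assumption: the keyring created by the cdh5-repository package is readable
# by root only, which is why apt (running apt-key as '_apt') ignores it.
# Assumes apt-transport-https was installed in an earlier layer.
RUN wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb && \
    dpkg -i cdh5-repository_1.0_all.deb && \
    chmod 644 /etc/apt/trusted.gpg.d/cloudera-cdh5.gpg && \
    apt-get -y update && \
    apt-get -y install hadoop-hdfs-fuse
```

If the NO_PUBKEY warning still appears after this change, falling back to --allow-unauthenticated as suggested above keeps the build working, at the cost of skipping signature verification for the install.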

DongZhaoYu commented 5 years ago

This is the Dockerfile I tried, and it works. You can add your other commands to this file to build your image.

FROM pai.build.base:hadoop2.7.2-cuda8.0-cudnn6-devel-ubuntu16.04

ENV CNTK_VERSION=2.0.beta11.0

RUN apt-get -y update && \
    apt-get -y install apt-transport-https && \
    apt-get -y install git \
        fuse \
        golang \
        libjasper1 \
        libjpeg8 \
        libpng12-0 \
        libgfortran3 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /

# Install hdfs-mount
RUN wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb && \
    dpkg -i cdh5-repository_1.0_all.deb && \
    apt-get -y update && apt-get -y --allow-unauthenticated install hadoop-hdfs-fuse

DongZhaoYu commented 5 years ago

The hdfs-fuse tool writes its configuration files to /etc/hadoop, which conflicts with our current setup because we mount this path read-only when a job starts. I have changed the path in https://github.com/Microsoft/pai/pull/1609