Trying to build the provided Dockerfile results in a failure #1

Closed guybartal closed 4 years ago

guybartal commented 5 years ago
C:\Users\gubert\Repos\MLOps_VideoAnomalyDetection\config>docker build -f Dockerfile -t wopauli_1.8-gpu:1 .
Sending build context to Docker daemon  8.704kB
Step 1/2 : FROM
intelmpi2018.3-cuda9.0-cudnn7-ubuntu16.04: Pulling from azureml/base-gpu
7b722c1070cd: Pull complete
5fbf74db61f1: Pull complete
ed41cb72e5c9: Pull complete
7ea47a67709e: Pull complete
35400734fa04: Pull complete
195acf8a5739: Pull complete
127028f911f6: Pull complete
84588368cc86: Pull complete
decbf3005a1c: Pull complete
249412ff35c9: Pull complete
3f601dfda46c: Pull complete
d481228abde9: Pull complete
38567447e6f3: Pull complete
1c5715cbc27e: Pull complete
9fdb00ca4b90: Pull complete
Digest: sha256:fd6c26ca1c5e8aefce47850ebaaea8ae58f2b1516b4530a75fff9d48ffd3a2bb
Status: Downloaded newer image for
 ---> c0ba45f719a0
Step 2/2 : RUN ldconfig /usr/local/cuda/lib64/stubs &&     conda install -y python=3.6.2 && conda clean -ay &&     pip install --no-cache-dir azureml-defaults &&     pip install --no-cache-dir tensorflow==1.8.0 tensorflow-gpu==1.8.0 keras==2.0.8 matplotlib==3.0.3 seaborn==0.9.0 requests==2.21.0 bs4==0.0.1 imageio==2.5.0 sklearn pandas==0.24.2 numpy==1.16.2 hickle==3.4.3 &&     pip install --no-cache-dir horovod==0.13.5 &&     ldconfig
 ---> Running in 70f2deb6967a
Solving environment: ...working... done

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.11
  latest version: 4.7.5

Please update conda by running

    $ conda update -n base -c defaults conda

Traceback (most recent call last):
  File "/opt/miniconda/bin/conda", line 7, in <module>
    from conda.cli import main
ModuleNotFoundError: No module named 'conda'
The command '/bin/sh -c ldconfig /usr/local/cuda/lib64/stubs &&     conda install -y python=3.6.2 && conda clean -ay &&     pip install --no-cache-dir azureml-defaults &&     pip install --no-cache-dir tensorflow==1.8.0 tensorflow-gpu==1.8.0 keras==2.0.8 matplotlib==3.0.3 seaborn==0.9.0 requests==2.21.0 bs4==0.0.1 imageio==2.5.0 sklearn pandas==0.24.2 numpy==1.16.2 hickle==3.4.3 &&     pip install --no-cache-dir horovod==0.13.5 &&     ldconfig' returned a non-zero code: 1

guybartal commented 5 years ago

Maybe you should remove "conda clean -ay" from the Dockerfile. I'm not sure what the purpose of that command is, but it is causing this problem...

wmpauli commented 5 years ago

Have you tried to update your conda base image as suggested in your log? "conda update -n base -c defaults conda"

Besides that, to me it looks like a problem with the conda installation/configuration. Building the docker image works for me.

"conda clean -ay" ensures that we use the newest version of all packages, in case packages have changed without updating release versions. This can be useful if you are including your own PyPI packages.
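
The log alone does not make it obvious which command in the chained RUN line raised the traceback, since every step is joined with `&&` and only the final exit code is reported. One way to narrow it down is to open a shell in a container started from the base image and run the steps one at a time. The following is only a minimal sketch and not part of the repository: `run_step` is a hypothetical helper name, and the commented usage just repeats the first commands of the RUN line from the log above.

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): run a single build step
# and report whether it succeeded, so the first failing step is easy to spot.
run_step() {
    name=$1
    shift
    if "$@"; then
        echo "ok: $name"
    else
        status=$?   # exit status of the command that just failed
        echo "FAILED: $name (exit $status)"
        return "$status"
    fi
}

# Example usage inside the container; the chain stops at the first failure:
#   run_step "conda install" conda install -y python=3.6.2 &&
#   run_step "conda clean"   conda clean -ay &&
#   run_step "pip azureml"   pip install --no-cache-dir azureml-defaults
```

If the "conda install" step prints `ok` and the "conda clean" step prints `FAILED`, that would suggest conda itself stopped working after the Python version was changed, rather than a problem specific to `conda clean`.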