ContinuumIO / anaconda-issues

Anaconda issue tracking

miniconda3 on s390x "illegal instruction" error on install/docker build #13039

Closed ccorley closed 2 years ago

ccorley commented 2 years ago

What happened?

Building a Docker image based on continuumio/miniconda3:4.12.0 results in an "Illegal instruction" crash during docker build on a LinuxONE (s390x) VM running RHEL 8.4. I also tested 4.10.3 and 4.11.0, with the same results.

To reproduce on an s390x VM running RHEL 8.4:

1. Create a Dockerfile containing only the line `FROM continuumio/miniconda3:4.12.0`
2. Run `docker build .`
3. Run `docker run -it <imageid> sh`
4. In the image, run `conda --version`, `pip --version` and `python -c "print(Test)"`

These all result in an "Illegal instruction".
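For anyone retrying these steps, here is a rough sketch of a host-side check that runs the same three commands through `docker run` and reports which ones die from a signal. It is only an illustration: the helper names `describe_exit` and `check` are made up for this example, the quoted form `print('Test')` is an assumption about the intended command, and it assumes that `docker run` passes the container's exit status through as 128 plus the signal number.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn an exit status into a short label, recognising signal deaths."""
    if returncode == 0:
        return "ok"
    # subprocess reports a child killed by signal N as -N, while shells
    # (and, by assumption here, `docker run`) report it as 128 + N.
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return f"exit {returncode}"
    try:
        return f"killed by {signal.Signals(signum).name}"
    except ValueError:
        return f"killed by signal {signum}"


# The three commands from the report; the quoting of 'Test' is assumed.
COMMANDS = [
    ["conda", "--version"],
    ["pip", "--version"],
    ["python", "-c", "print('Test')"],
]


def check(image: str) -> None:
    """Run each command in a throwaway container and print its status."""
    for cmd in COMMANDS:
        try:
            proc = subprocess.run(
                ["docker", "run", "--rm", image, *cmd],
                capture_output=True,
            )
            status = describe_exit(proc.returncode)
        except FileNotFoundError:
            status = "docker not found on this host"
        print(" ".join(cmd), "->", status)
```

Calling `check("continuumio/miniconda3:4.12.0")` on the affected VM should label each failing command with SIGILL if the crash is a genuine illegal-instruction fault rather than some other error.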

Building a new base image around an s390x installer such as Miniconda3-py39_4.11.0-Linux-s390x.sh also results in an illegal instruction crash on an s390x VM running RHEL 8.4. Use this Dockerfile to reproduce, but replace the installer script with the s390x installer script. The build fails with "Illegal instruction". Tested with the installers for 4.10.3, 4.11.0 and 4.12.0.

Error:

Step 5/6 : RUN wget --quiet https://repo.continuum.io/miniconda/Miniconda3-py39_4.12.0-Linux-s390x.sh -O ~/miniconda.sh &&  bash ~/miniconda.sh -b -p /opt/conda &&     rm ~/miniconda.sh
 ---> Running in 47d9a6c6b542
PREFIX=/opt/conda
Unpacking payload ...
/root/miniconda.sh: line 411:    28 Exit 141                { dd if="$THIS_PATH" bs=1 skip=15570915 count=10269 2> /dev/null; dd if="$THIS_PATH" bs=16384 skip=951 count=3478 2> /dev/null; dd if="$THIS_PATH" bs=1 skip=72564736 count=16176 2> /dev/null; }
        29 Illegal instruction     (core dumped) | "$CONDA_EXEC" constructor --extract-tar --prefix "$PREFIX"
/root/miniconda.sh: line 413:    34 Illegal instruction     (core dumped) "$CONDA_EXEC" constructor --prefix "$PREFIX" --extract-conda-pkgs
The command '/bin/sh -c wget --quiet https://repo.continuum.io/miniconda/Miniconda3-py39_4.12.0-Linux-s390x.sh -O ~/miniconda.sh &&     bash ~/miniconda.sh -b -p /opt/conda &&     rm ~/miniconda.sh' returned a non-zero code: 1

Core dump:

Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: User process fault: interruption code 0001 ilc:2 in libpython3.8.so.1.0[3ff85d00000+393000]
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: CPU: 0 PID: 10549 Comm: conda.exe Kdump: loaded Not tainted 4.18.0-305.el8.s390x #1
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: Hardware name: IBM 2964 NC9 7A5 (KVM/Linux)
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: User PSW : 0705000180000000 000003ff85df8cbc
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel:            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:0 PM:0 RI:0 EA:3
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: User GPRS: cbd66125d842255b 000003ff85fc0728 0000000000000008 000003ff864ba030
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel:            5555555555555556 000003ff864ba030 0000000000000000 0000000000000010
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel:            000003ff864ba030 000003ff864ba030 000003ff864b9080 000003ff864b9080
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel:            000003ff864a6f88 000003ff85fc0728 000003ff85df97ec 000003ffe317ba88
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: User Code: 000003ff85df8cac: e34010000004        lg        %r4,0(%r1)
                                                                000003ff85df8cb2: eb67003f000a        srag        %r6,%r7,63
                                                               #000003ff85df8cb8: b9ec4047                mgrk        %r4,%r7,%r4
                                                               >000003ff85df8cbc: a71900ff                lghi        %r1,255
                                                                000003ff85df8cc0: b9e96064                sgrk        %r6,%r4,%r6
                                                                000003ff85df8cc4: ebb60001000d        sllg        %r11,%r6,1
                                                                000003ff85df8cca: b90800b6                agr        %r11,%r6
                                                                000003ff85df8cce: ebbb0003000d        sllg        %r11,%r11,3
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel: Last Breaking-Event-Address:
Sep 09 13:51:57 rhel-8.4.zdalisv.dfw.ibm.com kernel:  [<000003ff85df97e6>] 0x3ff85df97e6
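A possible way to read the "User Code" part of the dump: on s390 the kernel marks the instruction at the PSW address with `>` and the one just before it with `#`. With interruption code 0001 (operation exception) and ilc:2, the `#` line is the instruction that faulted, here `mgrk`. As far as I can tell, `mgrk` belongs to the miscellaneous-instruction-extensions facility 2 that arrived with z14, while machine type 2964 is a z13, so a binary built for a newer architecture level is one plausible explanation. Treat that as a hypothesis, not a confirmed diagnosis. The sketch below only pulls the marked rows out of such a log; the names `ROW` and `marked_instructions` are made up for this example.

```python
import re

# One disassembly row: optional marker, 16-digit address, raw bytes,
# mnemonic, operands.
ROW = re.compile(
    r"(?P<mark>[#>])?(?P<addr>[0-9a-f]{16}):\s+(?P<raw>[0-9a-f]+)\s+"
    r"(?P<mnemonic>\w+)\s+(?P<operands>\S+)"
)


def marked_instructions(log_text: str) -> dict:
    """Return {marker: (address, mnemonic, operands)} for '#' and '>' rows."""
    found = {}
    for match in ROW.finditer(log_text):
        mark = match.group("mark")
        if mark:
            found[mark] = (
                match.group("addr"),
                match.group("mnemonic"),
                match.group("operands"),
            )
    return found


# Rows copied from the dump above.
SAMPLE = """
kernel: User Code: 000003ff85df8cac: e34010000004        lg        %r4,0(%r1)
            000003ff85df8cb2: eb67003f000a        srag        %r6,%r7,63
           #000003ff85df8cb8: b9ec4047                mgrk        %r4,%r7,%r4
           >000003ff85df8cbc: a71900ff                lghi        %r1,255
            000003ff85df8cc0: b9e96064                sgrk        %r6,%r4,%r6
"""

for marker, (addr, mnemonic, operands) in marked_instructions(SAMPLE).items():
    print(marker, addr, mnemonic, operands)
```

If the hypothesis holds, comparing the machine type reported on the failing VM with the one on a VM where the image works would be a natural next check.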

Running the builds as root. Also, adding 'ENV TMP /tmp' to the Dockerfile does not help.

Conda info

I can't run `conda info` because `conda` results in an illegal instruction.

Conda config

`conda` results in an illegal instruction

Conda list

`conda` results in an illegal instruction

Additional Context

To obtain a free LinuxONE VM, see the 1st step here and use rhel8 as the OS: https://gist.github.com/timroster/6e34ffeee5e63020c6529da08248af9b

sumit0190 commented 2 years ago

Unless I am trying to reproduce this incorrectly, I can't reproduce it.

(base) [linux1@testpython ~]$ uname -a
Linux testpython 4.18.0-305.12.1.el8_4.s390x #1 SMP Mon Jul 26 07:40:30 EDT 2021 s390x s390x s390x GNU/Linux
(base) [linux1@testpython ~]$ cat Dockerfile
FROM continuumio/miniconda3:4.12.0
(base) [linux1@testpython ~]$ docker build .
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
STEP 1/1: FROM continuumio/miniconda3:4.12.0
Resolved "continuumio/miniconda3" as an alias (/home/linux1/.cache/containers/short-name-aliases.conf)
Trying to pull docker.io/continuumio/miniconda3:4.12.0...
Getting image source signatures
Copying blob 4357873446a2 done
Copying blob f7bb6d2d11c3 done
Copying blob 729b2f9ea9d8 done
Copying config aba15d5f72 done
Writing manifest to image destination
Storing signatures
COMMIT
--> aba15d5f726
aba15d5f7264952b991864ba2ae851f33bf47bd35bac1cab78143b0f66bd128d
(base) [linux1@testpython ~]$ docker run -it aba15d5f7264952b991864ba2ae851f33bf47bd35bac1cab78143b0f66bd128d sh
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
# conda --version
conda 4.12.0

ccorley commented 2 years ago

Thanks for taking the time to test this. This is actually good news. That is indeed how I would reproduce it, so it may have been a local machine issue, although I uninstalled and reinstalled Docker, deleted and reloaded the Docker images, and tried everything else I could think of. On my end, I worked around the issue by basing the container on a Python Alpine image, but the conda image would have been preferable for its ease of configuration. I am closing this as unreproducible.