ContinuumIO / anaconda-issues

Anaconda issue tracking

PyQt with matplotlib import failure: libgcc_s.so.1 must be installed for pthread_cancel to work #9190

Open rth opened 6 years ago

rth commented 6 years ago

This issue was moved from https://github.com/matplotlib/matplotlib/issues/11058 (see that link for a more detailed description).

Actual Behavior

On Debian 9.4, in the following environment,

conda create -n test-env qt==5.9.4 matplotlib==2.2.2 pyqt==5.9.2 python==3.6.5 

importing

import PyQt5.Qt; import matplotlib.pyplot

raises,

libgcc_s.so.1 must be installed for pthread_cancel to work

Importing these packages in the opposite order, or separately, works fine.
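A minimal sketch for checking both import orders, each in a fresh interpreter so that an abort in one order does not take down the checking script. The helper names (`import_command`, `try_import_order`) are illustrative and not part of any package mentioned here.

```python
import subprocess
import sys

MODULES = ["PyQt5.Qt", "matplotlib.pyplot"]


def import_command(modules, python=sys.executable):
    """Build the argv for: python -c "import a; import b"."""
    statement = "; ".join("import {}".format(name) for name in modules)
    return [python, "-c", statement]


def try_import_order(modules, python=sys.executable):
    """Run the imports in a child interpreter; return (exit code, stderr)."""
    result = subprocess.run(
        import_command(modules, python),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; also works on Python 3.6
    )
    return result.returncode, result.stderr


# Example use (requires PyQt5 and matplotlib in the chosen interpreter):
#   for order in (MODULES, list(reversed(MODULES))):
#       print(order, try_import_order(order))
```

On POSIX a negative exit code means the child was killed by a signal, which is what an "Aborted (core dumped)" run would look like from the parent.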

Expected Behavior

Imports without errors.

Steps to Reproduce

See the docker setup in https://github.com/matplotlib/matplotlib/issues/11058

Anaconda or Miniconda version:

conda 4.4

Operating System:

Debian 9.4 Linux (official continuumio/miniconda:4.4 docker image )

cc @mingwandroid

mingwandroid commented 6 years ago

Thanks. I cannot reproduce this. I have asked others on the AD team to try to reproduce it.

rth commented 6 years ago

FWIW, I can reproduce it on a Debian 9.4 VM

I cannot reproduce it on a Debian 9.4 Docker container (nor on an Ubuntu 14.04 one). What am I missing?

Hmm, not sure, you agree that it happens in the official miniconda docker image on Travis CI (cf links at the end of https://github.com/matplotlib/matplotlib/issues/11058#issuecomment-381912684)? Also saw it in a Debian 9.4 box on EC2.

For more context, I first ran into this when using that image for CI on Wercker CI.

Thanks. I cannot reproduce this. I have asked others on the AD team to try to reproduce it.

Thanks for looking into it! Please let me know if there is anything I can do to help..

mingwandroid commented 6 years ago

Hmm, not sure, you agree that it happens in the official miniconda docker image on Travis CI

It appears that way yes, but I don't know exactly what is on Travis CI and I never use it (nor have I heard of Wercker CI) :-( At a guess, they're running some 32-bit graphics drivers or something like that?

I'm unfamiliar with how to investigate things on Travis CI.

mingwandroid commented 6 years ago

If you prefix the command with xvfb-run does the error happen? Can you get Travis to list all packages? Can you show the output from running file against every .so* in the /usr hierarchy?

rth commented 6 years ago

it appears that way yes, but I don't know exactly what is on Travis CI and I never use it (nor have I heard of Wercker CI) :-( At a guess, they're running some 32-bit graphics drivers or something like that?

Well, it's still the same official miniconda docker image, and as far as I understand, it shouldn't matter too much where it is running.. This issue does seem brittle as I wasn't able to reproduce it on another Debian system.

If you prefix the command with xvfb-run does the error happen?

Same thing except that I get a segmentation fault (Aborted (core dumped)) https://travis-ci.org/rth/ci-sandbox/builds/368048816#L888

Can you get Travis to list all packages? Can you show the output from running file against every .so* in the /usr hierarchy?

So it should be the same packages installed in the miniconda docker image, the output of file can be found in https://travis-ci.org/rth/ci-sandbox/builds/368048816#L1803 ; nothing that looks like 32 bit there..

Running with LD_DEBUG=libs does produce some other errors, though (see the full output at https://travis-ci.org/rth/ci-sandbox/builds/368048816#L895),

Probably unrelated error with MKL,

5:  calling init: /opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/numpy/_mklinit.cpython-36m-x86_64-linux-gnu.so
5:  
5:  /opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/numpy/_mklinit.cpython-36m-x86_64-linux-gnu.so: error: symbol lookup error: undefined symbol: omp_get_num_threads (fatal)

and, more interestingly,

 5: find library=libfreetype.so.6 [0]; searching
 5:  search path=/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../tls/x86_64:/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../tls:/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../x86_64:/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../..      (RPATH from file /opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/ft2font.cpython-36m-x86_64-linux-gnu.so)
  5:      trying file=/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../tls/x86_64/libfreetype.so.6
  5:      trying file=/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../tls/libfreetype.so.6
  5:      trying file=/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../x86_64/libfreetype.so.6
  5:      trying file=/opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../libfreetype.so.6
  5:    
  5:    calling init: /opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/../../../libfreetype.so.6
  5:    
  5:    calling init: /opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/matplotlib/ft2font.cpython-36m-x86_64-linux-gnu.so
  5:    
  5:    /opt/conda/envs/qt594mpl222/lib/python3.6/site-packages/PyQt5/../../.././libgobject-2.0.so.0: error: symbol lookup error: undefined symbol: g_source_destroy (fatal)
libgcc_s.so.1 must be installed for pthread_cancel to work

I checked that that is also a 64bit .so file..
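For long LD_DEBUG logs like the one linked above, a small filter can pull out just the fatal symbol lookup errors. This is a sketch that assumes each error is printed on a single line, as the dynamic loader normally does, and that the debug output goes to stderr (the default unless LD_DEBUG_OUTPUT is set). The function names are made up for illustration.

```python
import os
import re
import subprocess

# Matches lines such as:
#   "5:  /path/lib.so: error: symbol lookup error: undefined symbol: foo (fatal)"
FATAL_RE = re.compile(
    r"^\s*\d+:\s+(?P<object>\S+):\s+error: symbol lookup error: "
    r"undefined symbol: (?P<symbol>\S+) \(fatal\)"
)


def fatal_lookup_errors(ld_debug_text):
    """Return (object, symbol) pairs for every fatal lookup error line."""
    errors = []
    for line in ld_debug_text.splitlines():
        match = FATAL_RE.match(line)
        if match:
            errors.append((match.group("object"), match.group("symbol")))
    return errors


def run_with_ld_debug(argv, categories="libs"):
    """Run argv with LD_DEBUG set and return whatever it wrote to stderr."""
    env = dict(os.environ, LD_DEBUG=categories)
    proc = subprocess.run(
        argv, env=env, stderr=subprocess.PIPE, universal_newlines=True
    )
    return proc.stderr
```

Feeding the captured stderr of the failing import into `fatal_lookup_errors` should list the same two objects shown above, if the log has the same shape.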

mingwandroid commented 6 years ago

This issue does seem brittle as I wasn't able to reproduce it on another Debian system.

To me these Travis machines seem broken in some subtle way. And docker does not offer the amount of isolation people think it does (clearly, otherwise things would either reproduce reliably or they would not given the same image).

nm -g /opt/conda/envs/qt594mpl222/lib/libglib-2.0.so.0 | grep g_source_destroy
000000000004a330 T g_source_destroy
readelf -d /opt/conda/envs/qt594mpl222/lib/libgobject-2.0.so.0

Dynamic segment contains 29 entries:
 Addr: 0x0000000000253840  Offset: 0x053840  Link to section: [ 3] '.dynstr'
  Type              Value
  NEEDED            Shared library: [libglib-2.0.so.0]
  NEEDED            Shared library: [libffi.so.6]
  NEEDED            Shared library: [libc.so.6]
  SONAME            Library soname: [libgobject-2.0.so.0]
  RPATH             Library rpath: [$ORIGIN/.]
  INIT              0x000000000000ac70
  FINI              0x000000000003d924
  INIT_ARRAY        0x0000000000253358
  INIT_ARRAYSZ      8 (bytes)
  HASH              0x0000000000000190
  STRTAB            0x0000000000005530
  SYMTAB            0x0000000000001ae0
  STRSZ             13612 (bytes)
  SYMENT            24 (bytes)
  PLTGOT            0x0000000000253a10
  RELA              0x0000000000008f78
  RELASZ            7416 (bytes)
  RELAENT           24 (bytes)
  BIND_NOW
  FLAGS_1           NOW NODELETE
  VERNEED           0x0000000000008f38
  VERNEEDNUM        1
  VERSYM            0x0000000000008a5c
  RELACOUNT         128
  NULL
  NULL
  NULL
  NULL
  NULL

That looks fine so I still have no idea about this.
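The same check can be scripted. The sketch below extracts the NEEDED and RPATH/RUNPATH entries from `readelf -d` text; the regular expressions are written to tolerate both the layout shown above and the parenthesised `(NEEDED)` layout that binutils prints. The function name is hypothetical.

```python
import re

NEEDED_RE = re.compile(r"NEEDED\)?\s+Shared library: \[([^\]]+)\]")
RPATH_RE = re.compile(
    r"(?:RPATH|RUNPATH)\)?\s+Library (?:rpath|runpath): \[([^\]]+)\]"
)


def dynamic_dependencies(readelf_text):
    """Return the NEEDED sonames and RPATH/RUNPATH entries of one object."""
    return {
        "needed": NEEDED_RE.findall(readelf_text),
        "rpath": RPATH_RE.findall(readelf_text),
    }
```

For the listing above this would report libglib-2.0.so.0 among the needed libraries and `$ORIGIN/.` as the rpath, which is why the symbol is expected to resolve from the libglib that sits next to libgobject.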

rth commented 6 years ago

Yes, it's a bit strange, I guess we'll see if other people run into this.. Thanks for investigating!

nehaljwani commented 6 years ago

I am able to reproduce this.

root@b70520424601:/# /opt/conda/envs/test-env/bin/python -c "import PyQt5.Qt; import matplotlib.pyplot"
libgcc_s.so.1 must be installed for pthread_cancel to work
Aborted (core dumped)

nehaljwani commented 6 years ago

Source of problem:

        27:     relocation processing: /opt/conda/envs/test-env/lib/python3.6/site-packages/PyQt5/../../.././libgobject-2.0.so.0
        27:     symbol=g_source_destroy;  lookup in file=/opt/conda/envs/test-env/bin/python [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib/x86_64-linux-gnu/libutil.so.1 [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib/x86_64-linux-gnu/librt.so.1 [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
        27:     symbol=g_source_destroy;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
        27:     symbol=g_source_destroy;  lookup in file=/opt/conda/envs/test-env/lib/python3.6/site-packages/numpy/_mklinit.cpython-36m-x86_64-linux-gnu.so [0]
        27:     symbol=g_source_destroy;  lookup in file=/opt/conda/envs/test-env/lib/python3.6/site-packages/numpy/core/../../../../libmkl_rt.so [0]
        27:     symbol=g_source_destroy;  lookup in file=/opt/conda/envs/test-env/lib/python3.6/site-packages/numpy/../../../libiomp5.so [0]
        27:     symbol=g_source_destroy;  lookup in file=/opt/conda/envs/test-env/lib/python3.6/site-packages/PyQt5/../../../libgcc_s.so.1 [0]
        27:     /opt/conda/envs/test-env/lib/python3.6/site-packages/PyQt5/../../.././libgobject-2.0.so.0: error: symbol lookup error: undefined symbol: g_source_destroy (fatal)
        27:     symbol=__libc_fatal;  lookup in file=/opt/conda/envs/test-env/bin/python [0]
        27:     symbol=__libc_fatal;  lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
        27:     symbol=__libc_fatal;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
        27:     binding file /lib/x86_64-linux-gnu/libpthread.so.0 [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `__libc_fatal' [GLIBC_PRIVATE]
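In the excerpt above, none of the files searched for g_source_destroy is libglib-2.0.so.0. A sketch like the following can confirm that against the full LD_DEBUG output; it only assumes the `symbol=...;  lookup in file=...` line shape shown here, and the helper names are illustrative.

```python
import re

LOOKUP_RE = re.compile(
    r"symbol=(?P<symbol>[^;\s]+);\s+lookup in file=(?P<file>\S+)"
)


def lookup_scope(ld_debug_text, symbol):
    """List, in order, the files the loader searched for `symbol`."""
    files = []
    for match in LOOKUP_RE.finditer(ld_debug_text):
        if match.group("symbol") == symbol and match.group("file") not in files:
            files.append(match.group("file"))
    return files


def scope_contains(files, soname):
    """True if any searched path has `soname` as its file name."""
    return any(path.rsplit("/", 1)[-1] == soname for path in files)
```

If `scope_contains(lookup_scope(log, "g_source_destroy"), "libglib-2.0.so.0")` is false for the whole log, the library that defines the symbol was never part of the search, which would be consistent with the fatal error.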

The output is different from what Ray would expect it to be:

root@b70520424601:/# nm -A /opt/conda/envs/test-env/lib/python3.6/site-packages/PyQt5/../../.././libgobject-2.0.so.0 | grep g_source_destroy
/opt/conda/envs/test-env/lib/python3.6/site-packages/PyQt5/../../.././libgobject-2.0.so.0:                 U g_source_destroy

mingwandroid commented 6 years ago

That output is fine, the symbol should be resolved in libglib-2.0.so.0 (see my comment).

Which version of glib has been installed here? conda list --show-channel-urls please.
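To make the distinction concrete: `U` in nm output means the object references the symbol without defining it, while `T` means it is defined in the object's text section. A small sketch that reads the type letter from nm output, with or without the `-A` file prefix (the function name is made up):

```python
def symbol_types(nm_text, symbol):
    """Return the nm type letters reported for `symbol`, in order."""
    types = []
    for line in nm_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Drop any "@VERSION" suffix that nm may append to the name.
        name = fields[-1].split("@", 1)[0]
        if name == symbol:
            types.append(fields[-2])
    return types
```

Run against libgobject-2.0.so.0 this should give `U`, and against libglib-2.0.so.0 it should give `T`, matching the two nm outputs quoted in this thread.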

nehaljwani commented 6 years ago

Version of glib installed is glib-2.53.6-h5d9569c_2

nehaljwani commented 6 years ago

Complete output:

root@b70520424601:/# conda list -n test-env --show-channel-urls
# packages in environment at /opt/conda/envs/test-env:
#
# Name                    Version                   Build  Channel
ca-certificates           2018.03.07                    0    defaults
certifi                   2018.1.18                py36_0    defaults
cycler                    0.10.0           py36h93f1223_0    defaults
dbus                      1.13.2               hc3f9b76_0    defaults
expat                     2.2.5                he0dffb1_0    defaults
fontconfig                2.12.6               h49f89f6_0    defaults
freetype                  2.8                  hab7d2ae_1    defaults
glib                      2.53.6               h5d9569c_2    defaults
gst-plugins-base          1.12.4               h33fb286_0    defaults
gstreamer                 1.12.4               hb53b477_0    defaults
icu                       58.2                 h9c2bf20_1    defaults
intel-openmp              2018.0.0                      8    defaults
jpeg                      9b                   h024ee3a_2    defaults
kiwisolver                1.0.1            py36h764f252_0    defaults
libedit                   3.1                  heed3624_0    defaults
libffi                    3.2.1                hd88cf55_4    defaults
libgcc-ng                 7.2.0                hdf63c60_3    defaults
libgfortran-ng            7.2.0                hdf63c60_3    defaults
libpng                    1.6.34               hb9fc6fc_0    defaults
libstdcxx-ng              7.2.0                hdf63c60_3    defaults
libxcb                    1.13                 h1bed415_1    defaults
libxml2                   2.9.8                hf84eae3_0    defaults
matplotlib                2.2.2            py36h0e671d2_1    defaults
mkl                       2018.0.2                      1    defaults
mkl_fft                   1.0.1            py36h3010b51_0    defaults
mkl_random                1.0.1            py36h629b387_0    defaults
ncurses                   6.0                  h9df7e31_2    defaults
numpy                     1.14.2           py36hdbf6ddf_1    defaults
openssl                   1.0.2o               h20670df_0    defaults
pcre                      8.42                 h439df22_0    defaults
pip                       9.0.3                    py36_0    defaults
pyparsing                 2.2.0            py36hee85983_1    defaults
pyqt                      5.9.2            py36h751905a_0    defaults
python                    3.6.5                hc3d631a_0    defaults
python-dateutil           2.7.2                    py36_0    defaults
pytz                      2018.4                   py36_0    defaults
qt                        5.9.4                h4e5bff0_0    defaults
readline                  7.0                  ha6073c6_4    defaults
setuptools                39.0.1                   py36_0    defaults
sip                       4.19.8           py36hf484d3e_0    defaults
six                       1.11.0           py36h372c433_1    defaults
sqlite                    3.22.0               h1bed415_0    defaults
tk                        8.6.7                hc745277_3    defaults
tornado                   5.0.1                    py36_1    defaults
wheel                     0.31.0                   py36_0    defaults
xz                        5.2.3                h55aa19d_2    defaults
zlib                      1.2.11               ha838bed_2    defaults

mingwandroid commented 6 years ago

Can you try updating glib to 2.56.1?

nehaljwani commented 6 years ago

That updates qt. OP wants qt 5.9.4

mingwandroid commented 6 years ago

No, I don't think anyone wants Qt 5.9.4 vs Qt 5.9.5 (which was released since this issue was opened or around the same time).

mingwandroid commented 6 years ago

5.9.5 is a bugfix release which is fully binary compatible, so why would anyone not want that?

nehaljwani commented 6 years ago

Upgrading glib and qt doesn't help. Same error. LD_DEBUG=all output attached for both. glib2.53.6.txt.gz glib2.56.1.txt.gz

nehaljwani commented 6 years ago

I compiled the latest glibc and ran the import against it, and the problem went away. I'll probably run a bisect later to find out when and where it might have been fixed, but here is what I did (inside the container):

apt-get install gcc bison gawk make
git clone git://sourceware.org/git/glibc.git; cd glibc
mkdir build; cd $_
../configure --disable-sanity-checks
make -jX
sed -i 's/math:/math:"${builddir}"\/login:/' testrun.sh # Because our python needs libutil.so.1
sed -i 's/math:/math:\/usr\/lib\/x86_64-linux-gnu\/:/' testrun.sh # Because libGL.so.1 lives there.

And then:

root@159a1e58d27c:/glibc/build# ./testrun.sh /opt/conda/envs/test-env/bin/python -c "import PyQt5.Qt; import matplotlib.pyplot"
root@159a1e58d27c:/glibc/build# echo $?
0

mingwandroid commented 6 years ago

Interesting. Thanks Nehal.

mingwandroid commented 6 years ago

@nehaljwani, can you please paste the output from:

apt-file search libpthread.so.0
nm -g /lib/x86_64-linux-gnu/libpthread.so.0 | grep GLIBC_2.14
dpkg --list | grep libc6

mingwandroid commented 6 years ago

I still (obviously) cannot reproduce this.

I have tried numerous docker containers on macOS:

centos6, centos7, debian 7.4, debian 9.4, ubuntu 12.02, ubuntu 14.04 .. a Debian 7 VM running on Vagrant/VirtualBox .. and on SUSE running on VMWare.

There are some very suspicious differences between the symbols resolved from libpthread.so.0 in @nehaljwani's logs and in my own. I am guessing this is down to installing docker, which needs a newer (but unfortunately probably subtly broken) glibc, but I am not sure about that. I suspect it is broken because Nehal's logs both contain "WARNING: Unsupported flag value(s) of 0x8000000 in DT_FLAGS_1." while mine do not.

Can someone give me clear, working instructions for how to repro this starting from nothing? I don't care upon what technology (though I'd prefer not to have to touch any 3rd-party cloud-based CI systems if at all possible).

Also, can someone try changing the miniconda docker to be based upon centos:6 instead? That would seem like a very sensible thing to do in general.

nehaljwani commented 6 years ago

Requested output:

root@453cdb10feeb:/# apt-file search libpthread.so.0
libc6: /lib/x86_64-linux-gnu/libpthread.so.0
libc6-i386: /lib32/libpthread.so.0
libc6-x32: /libx32/libpthread.so.0

root@453cdb10feeb:/# nm -g /lib/x86_64-linux-gnu/libpthread.so.0 | grep GLIBC_2.14
                 U memcpy@@GLIBC_2.14

root@453cdb10feeb:/# dpkg --list | grep libc6
ii  libc6:amd64                2.19-18+deb8u10                  amd64        GNU C Library: Shared libraries

I reproduced this issue on a CentOS 7 machine with Docker version 17.09.0-ce, build afdb6d4

For changing the docker base image, we'll have to raise an issue at https://github.com/ContinuumIO/docker-images

mingwandroid commented 6 years ago

Hi @nehaljwani, this still does not reproduce for me.

I launch docker from macOS with the following command:

docker run -it -v /opt/conda-linux-64:/opt/conda -v /opt/conda-linux-32:/opt/conda-32 -v ${HOME}:/root -it centos:7 /bin/bash
yum install mesa-libGL.x86_64
python -c "import PyQt5.Qt; import matplotlib.pyplot"

.. no problem. Some details:

uname -a
Linux 9e15d03656f8 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

mingwandroid commented 6 years ago

Also, you said you are running this from Docker on CentOS 7. Which Docker image do you use?

nehaljwani commented 6 years ago

@mingwandroid I was able to reproduce this on two different machines now. One running CentOS7 and another running Ubuntu 16.04.3

Docker version on CentOS box: 17.09.0-ce (kernel version 3.10.0-693.5.2)
Docker version on Ubuntu box: 17.05.0-ce (kernel version 4.4.0-112)

Dockerfile used:

FROM continuumio/miniconda:4.4.10
RUN apt-get install -y xvfb
RUN conda create -n test-env qt==5.9.4 matplotlib==2.2.2 pyqt==5.9.2 python==3.6.5 

Command to reproduce:

docker build -t test-pyqt .
docker run --rm test-pyqt:latest \
      /opt/conda/envs/test-env/bin/python -c "import PyQt5.Qt; import matplotlib.pyplot"

nehaljwani commented 6 years ago

FWIW, if I do the following (inside the container):

apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6

The import succeeds.

astrofrog commented 6 years ago

I've narrowed it down to libxss1. I get the error described here unless libxss1 is installed. The other libraries mentioned above are needed too, but this one is the one I was missing in my particular case.
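One way to find out which of these system libraries is absent on a given machine is to try loading each one by soname. The sketch below does that with ctypes. The package-to-soname mapping is my assumption about what the Debian packages listed above provide, and `missing_packages` is an illustrative name. Note that `ctypes.CDLL` really loads each library into the current process, so this is best run in a throwaway interpreter.

```python
import ctypes

# Assumed mapping: Debian package name -> soname it is expected to provide.
PACKAGE_SONAMES = {
    "libgl1-mesa-glx": "libGL.so.1",
    "libegl1-mesa": "libEGL.so.1",
    "libxrandr2": "libXrandr.so.2",
    "libxss1": "libXss.so.1",
    "libxcursor1": "libXcursor.so.1",
    "libxcomposite1": "libXcomposite.so.1",
    "libasound2": "libasound.so.2",
    "libxi6": "libXi.so.6",
    "libxtst6": "libXtst.so.6",
}


def missing_packages(package_sonames=PACKAGE_SONAMES, loader=ctypes.CDLL):
    """Return the packages whose soname cannot be loaded by `loader`."""
    missing = []
    for package, soname in package_sonames.items():
        try:
            loader(soname)
        except OSError:  # raised by ctypes when the library is not found
            missing.append(package)
    return missing
```

The `loader` parameter exists only so the check can be exercised without touching real libraries; calling `missing_packages()` with no arguments uses `ctypes.CDLL`.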

acrosby commented 5 years ago

@mingwandroid I'm not sure that mounting your system /opt directories into the docker container is best practice to test this out.

mingwandroid commented 5 years ago

They aren't 'system' opt folders. They're my opt folders and I know that what's in them is appropriate to share in my docker containers, but thank you for your concern!

joshuacwnewton commented 2 years ago

I encountered this issue in a project I'm working on (https://github.com/spinalcordtoolbox/spinalcordtoolbox/issues/3511). I found that swapping the import order for PyQt and matplotlib.pyplot did indeed fix the issue (as mentioned in the issue description).

Notably, this error started occurring for us under these specific conditions:

SalahEddineGhamri commented 2 years ago

FWIW, if I do the following (inside the container):

apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6

The import succeeds.

solves the issue thanks