apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
Apache License 2.0

[Python] cannot allocate memory in static TLS block exception raised when pyarrow is imported before _mysql #39854

Open mdobrzanski opened 7 months ago

mdobrzanski commented 7 months ago

Describe the bug, including details regarding any error messages, version, and platform.

mysqlclient (MySQLdb) cannot be imported if pyarrow is imported first. The import fails with ImportError: /lib64/libstdc++.so.6: cannot allocate memory in static TLS block

I have experienced this on Red Hat and Oracle Linux distributions.

Steps to reproduce:

  1. start a docker container from the Oracle Linux 9 image: docker run -ti oraclelinux:9 bash
  2. install system dependencies
    dnf install -y epel-release 
    dnf config-manager --set-enabled ol9_codeready_builder
    dnf install gcc python3-devel python3-pip

    libmysql is 8.0.32 from the Oracle Linux repositories, but I had the same problem when I used libmysql from the MySQL community repository.

  3. install python dependencies (pyarrow and mysqlclient, a.k.a. MySQLdb)
    pip3 install pyarrow==15.0.0 mysqlclient==2.2.1
  4. the MySQLdb import works when pyarrow has not been imported earlier
    [root@930fd58ceb8a /]# python3 -c 'import MySQLdb; print("OK")'
  5. the import fails if pyarrow is imported before MySQLdb, with ImportError: /lib64/libstdc++.so.6: cannot allocate memory in static TLS block
    [root@930fd58ceb8a /]# python3 -c 'import pyarrow; import MySQLdb; print("OK")'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/lib64/python3.9/site-packages/MySQLdb/__init__.py", line 17, in <module>
        from . import _mysql
    ImportError: /lib64/libstdc++.so.6: cannot allocate memory in static TLS block
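
Steps 4 and 5 can also be checked from a single script. The following is a minimal sketch, assuming pyarrow and mysqlclient are installed as in step 3; the helper name try_import_order is hypothetical and not part of either library. Each import order runs in a fresh interpreter so that one attempt cannot influence the next.

```python
import subprocess
import sys


def try_import_order(modules, env=None):
    """Import `modules` in the given order in a fresh interpreter.

    Returns (ok, last_stderr_line). `env` is passed through to the child
    process unchanged; None means "inherit the current environment".
    """
    code = "; ".join(f"import {name}" for name in modules)
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        env=env,
    )
    lines = proc.stderr.strip().splitlines()
    return proc.returncode == 0, (lines[-1] if lines else "")


if __name__ == "__main__":
    # Hypothetical driver: compare both import orders from the report.
    for order in (["MySQLdb", "pyarrow"], ["pyarrow", "MySQLdb"]):
        ok, err = try_import_order(order)
        print(" -> ".join(order), "OK" if ok else f"FAILED: {err}")
```

On an affected system the second order would be expected to report the ImportError from step 5, while the first order reports OK.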

I probed the installed pyarrow library. The MySQLdb import starts to fail once pyarrow.lib has been imported: https://github.com/apache/arrow/blob/maint-15.0.0/python/pyarrow/__init__.py#L65

Looking at the shared libraries, _mysql, pyarrow.lib and libmysqlclient all use the same libstdc++.so.6 => /lib64/libstdc++.so.6

[root@930fd58ceb8a /]# ldd /usr/local/lib64/python3.9/site-packages/MySQLdb/_mysql.cpython-39-x86_64-linux-gnu.so 
    linux-vdso.so.1 (0x00007ffda316c000)
    libmysqlclient.so.21 => /usr/lib64/mysql/libmysqlclient.so.21 (0x00007f4c2d672000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f4c2d469000)
    libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f4c2d392000)
    libssl.so.3 => /lib64/libssl.so.3 (0x00007f4c2d2ec000)
    libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007f4c2ceb9000)
    libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f4c2cea5000)
    libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f4c2cc7c000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f4c2cc60000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f4c2dd11000)
    libz.so.1 => /lib64/libz.so.1 (0x00007f4c2cc46000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f4c2cb6b000)
[root@930fd58ceb8a /]# ldd /usr/local/lib64/python3.9/site-packages/pyarrow/lib.cpython-39-x86_64-linux-gnu.so 
    linux-vdso.so.1 (0x00007fff96b91000)
    libarrow_python.so => /usr/local/lib64/python3.9/site-packages/pyarrow/libarrow_python.so (0x00007f1798f6b000)
    libarrow_dataset.so.1500 => /usr/local/lib64/python3.9/site-packages/pyarrow/libarrow_dataset.so.1500 (0x00007f1798da1000)
    libparquet.so.1500 => /usr/local/lib64/python3.9/site-packages/pyarrow/libparquet.so.1500 (0x00007f1798538000)
    libarrow_acero.so.1500 => /usr/local/lib64/python3.9/site-packages/pyarrow/libarrow_acero.so.1500 (0x00007f17983c5000)
    libarrow.so.1500 => /usr/local/lib64/python3.9/site-packages/pyarrow/libarrow.so.1500 (0x00007f17957e0000)
    libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f17955b4000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f17954d9000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f17954bd000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f17952b4000)
    librt.so.1 => /lib64/librt.so.1 (0x00007f17952af000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f17952aa000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f17952a3000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f17994ed000)
[root@930fd58ceb8a /]# ldd /usr/lib64/mysql/libmysqlclient.so.21
    linux-vdso.so.1 (0x00007ffe5cfaa000)
    libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fb1976c3000)
    libssl.so.3 => /lib64/libssl.so.3 (0x00007fb19761d000)
    libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007fb1971ea000)
    libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fb1971d6000)
    libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb196faf000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb196f93000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb196d88000)
    libz.so.1 => /lib64/libz.so.1 (0x00007fb196d6e000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fb196c93000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fb197e28000)
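
The comparison of the three ldd listings can be automated. This is a rough sketch, assuming the ldd output has the "name => path (address)" shape shown above; parse_ldd and shared_library_paths are hypothetical helper names. Lines without "=>" (such as linux-vdso and the dynamic loader) are skipped.

```python
import re

# Matches lines such as:
#   libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f4c2cc7c000)
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd(output):
    """Map each library name in `ldd` output to its resolved path."""
    resolved = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            resolved[match.group(1)] = match.group(2)
    return resolved


def shared_library_paths(outputs, name="libstdc++.so.6"):
    """Return the set of paths `name` resolves to across several ldd outputs.

    A single-element set means every binary resolves `name` to the same file.
    """
    return {parse_ldd(text).get(name) for text in outputs}
```

Feeding the three listings above into shared_library_paths would be one way to confirm that all of them resolve libstdc++.so.6 to /lib64/libstdc++.so.6.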

Interestingly, importing the other way round or setting LD_PRELOAD=/lib64/libstdc++.so.6 makes it work fine:

[root@930fd58ceb8a /]# python3 -c 'import MySQLdb; import pyarrow; print("OK")'

[root@930fd58ceb8a /]# LD_PRELOAD=/lib64/libstdc++.so.6 python3 -c 'import pyarrow; import MySQLdb; print("OK")'
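
When the interpreter is launched from another Python process, the LD_PRELOAD workaround can be applied through the child's environment. This is a minimal sketch, not a recommended fix; env_with_preload is a hypothetical helper, and the library path is the one from this report and may differ on other systems.

```python
import os


def env_with_preload(library, base_env=None):
    """Return a copy of the environment with `library` first in LD_PRELOAD.

    Existing LD_PRELOAD entries are kept. The dynamic loader accepts both
    spaces and colons as separators, so both are handled here.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "").replace(":", " ").split()
    entries = [library] + [item for item in existing if item != library]
    env["LD_PRELOAD"] = ":".join(entries)
    return env
```

The returned mapping can be passed as env= to subprocess.run when starting python3 -c 'import pyarrow; import MySQLdb'. LD_PRELOAD only takes effect at process start, so setting it inside an already running interpreter does not help.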

Could you help me understand what is going on and what would be the right fix for that?



xhochy commented 7 months ago

This is related to https://github.com/jemalloc/jemalloc/issues/937 and https://github.com/conda-forge/arrow-cpp-feedstock/issues/636

mdobrzanski commented 7 months ago

@xhochy, that's some advanced stuff, and I don't think I understood everything in the related issues. What I understood is that pyarrow has been fixed and that it is MySQLdb, or even libmysqlclient, that needs to be fixed. There was an issue with the latter on Ubuntu: https://bugs.launchpad.net/ubuntu/+source/mysql-8.0/+bug/1889851. Is that right? In your opinion, where should the fixes be made?

xhochy commented 7 months ago

Yes, that sounds related and the fix needs to be done on the MySQLdb side.