numpy / numpy

The fundamental package for scientific computing with Python.
https://numpy.org

ImportError: PyCapsule_Import could not import module "datetime" #14474

Closed cmosig closed 2 years ago

cmosig commented 5 years ago

Reproducing code example:

import matplotlib.pyplot as plt;

Before this happened I was running a different Python script that used 89 processes. Unfortunately, I cannot share that script publicly. Since then, numpy crashes immediately on import.

Error message:

OpenBLAS blas_thread_init: pthread_create failed for thread 52 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 53 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 54 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 55 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 56 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 57 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 58 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 59 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 60 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 61 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 62 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
OpenBLAS blas_thread_init: pthread_create failed for thread 63 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
Traceback (most recent call last):
  File "/usr/local/lib64/python3.6/site-packages/numpy/core/__init__.py", line 17, in <module>
    from . import multiarray
  File "/usr/local/lib64/python3.6/site-packages/numpy/core/multiarray.py", line 14, in <module>
    from . import overrides
  File "/usr/local/lib64/python3.6/site-packages/numpy/core/overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ImportError: PyCapsule_Import could not import module "datetime"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "temp.py", line 1, in <module>
    import matplotlib.pyplot as plt; 
  File "/usr/local/lib64/python3.6/site-packages/matplotlib/__init__.py", line 138, in <module>
    from . import cbook, rcsetup
  File "/usr/local/lib64/python3.6/site-packages/matplotlib/cbook/__init__.py", line 31, in <module>
    import numpy as np
  File "/usr/local/lib64/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
    from . import core
  File "/usr/local/lib64/python3.6/site-packages/numpy/core/__init__.py", line 47, in <module>
    raise ImportError(msg)
ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy c-extensions failed.
- Try uninstalling and reinstalling numpy.
- If you have already done that, then:
  1. Check that you expected to use Python3.6 from "/usr/bin/python3",
     and that you have no directories in your PATH or PYTHONPATH that can
     interfere with the Python and numpy version "1.17.2" you're trying to use.
  2. If (1) looks fine, you can open a new issue at
     https://github.com/numpy/numpy/issues.  Please include details on:
     - how you installed Python
     - how you installed numpy
     - your operating system
     - whether or not you have multiple versions of Python installed
     - if you built from source, your compiler versions and ideally a build log

- If you're working with a numpy git repository, try `git clean -xdf`
  (removes all files not under version control) and rebuild numpy.

Note: this error has many possible causes, so please don't comment on
an existing issue about this - open a new one instead.

Original error was: PyCapsule_Import could not import module "datetime"

zsh: segmentation fault  python3 temp.py

Numpy/Python version information:

numpy-1.7.1-13.el7.x86_64

charris commented 5 years ago

NumPy 1.17.1, correct? What Python version and platform? The problematic call looks to be from the CPython macro PyDateTime_IMPORT in datetime.c. Did this use to work?

cmosig commented 5 years ago

Yes, that is the correct numpy version. The Python version is 3.6.8. Yes, this used to work before. Here is the Linux version as well:

cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

I hope it helps. Is there a way I could temporarily fix this problem?

mattip commented 5 years ago

What happens if you import matplotlib once in the process before starting your threads?

mattip commented 5 years ago

Or maybe import datetime; import matplotlib

mattip commented 5 years ago

As far as I can tell, this happens in numpy_pydatetime_import when we call CPython's PyDateTime_IMPORT. We call this quite late in the initialization code, in init_multiarray_umath, which according to your log happens after OpenBLAS opens its threadpool. Perhaps moving the call earlier in the initialization code would help.

This might be due to a faulty Python installation where datetime is somehow broken, so making sure import datetime works would be a first step.
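For anyone wanting to run that first check, here is a minimal sketch. The helper name `check_datetime_module` is made up for illustration. It only reports where `datetime` was loaded from and whether the `datetime_CAPI` capsule (the object that CPython's `PyDateTime_IMPORT` looks up) is present; it does not reproduce numpy's own import path.

```python
import datetime
import sys


def check_datetime_module():
    """Report where datetime was loaded from and whether its C API capsule is exposed.

    Hypothetical helper for illustration; not part of numpy or CPython.
    """
    return {
        "python": sys.version.split()[0],
        "module_file": getattr(datetime, "__file__", None),
        # PyDateTime_IMPORT calls PyCapsule_Import("datetime.datetime_CAPI", 0),
        # so a missing attribute here would point at a broken installation.
        "has_capi_capsule": hasattr(datetime, "datetime_CAPI"),
    }


for key, value in check_datetime_module().items():
    print(f"{key}: {value}")
```

If `module_file` points somewhere unexpected (for example a stray `datetime.py` on `PYTHONPATH`), that would be worth ruling out before looking at thread limits.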

cmosig commented 5 years ago

Without changing anything, I ran the actual script again and it did not fail immediately. Instead I only received this error again:

OpenBLAS blas_thread_init: pthread_create failed for thread 60 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1028736 max
...

At the moment I cannot recreate the state where it would fail immediately with the exception in the first comment. I assume the problem is something dynamic controlled by the OS?

> As far as I can tell, this happens in numpy_pydatetime_import when we call CPython's PyDateTime_IMPORT. We call this quite late in the initialization code, in init_multiarray_umath, which according to your log happens after OpenBLAS opens its threadpool. Perhaps moving the call earlier in the initialization code would help.
>
> This might be due to a faulty Python installation where datetime is somehow broken, so making sure import datetime works would be a first step.
>
> Or maybe import datetime; import matplotlib

All these imports are currently working fine.

mattip commented 5 years ago

You may be saturating your machine by running many processes. Importing numpy in each process opens a thread pool with as many threads as there are CPUs, so if you have 64 CPUs and open 100 processes, that will be thousands of threads. You might be interested in threadpoolctl to manage the number of threads each process opens.
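To make the arithmetic concrete, here is a rough sketch that compares the estimated thread demand with a limit such as the `RLIMIT_NPROC 4096 current` value shown in the log. The function names are hypothetical, and the figures (89 processes, 64 threads each) are taken from the report above as an assumption about what was running. As far as I understand, on Linux this limit counts threads as well as processes for the user.

```python
import resource


def total_threads(n_processes, threads_per_process):
    """Estimated number of BLAS threads if every process opens a full pool."""
    return n_processes * threads_per_process


def exceeds_limit(n_processes, threads_per_process, soft_limit):
    """Return True if the estimated demand is above the given soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return total_threads(n_processes, threads_per_process) > soft_limit


# Figures assumed from the report: 89 processes, pools of 64 threads, limit 4096.
print(total_threads(89, 64))
print(exceeds_limit(89, 64, 4096))

# The limit that applies to the current user on this machine (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print(soft, hard)
```

This ignores every other thread and process the user already has running, so the real headroom is smaller than the estimate suggests.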

cmosig commented 5 years ago

Hmm, okay, that makes sense. In my case this would be 88 processes * 88 CPUs = 7744 threads. I tried limiting the number of threads per cpu to one, but unfortunately this did not work. I received the same errors again, and in the end I was back in the state where numpy would crash immediately after import.

I am connected via ssh to a server where I run this code. What I noticed is that when closing the ssh connection and then connecting again, numpy does not crash immediately after import.

mattip commented 5 years ago

> I tried limiting the number of threads per cpu to one, but unfortunately this did not work

Did you use threadpoolctl? Perhaps you could continue this part of the issue with them

HariharasudhanAS commented 3 years ago

I had the same problem using Jobe - a sandbox for running code. What worked was inserting these lines in the script before importing numpy.

import os

os.environ['OPENBLAS_NUM_THREADS'] = '1'

So yeah, it looks like limiting the number of threads works.
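A slightly more general sketch of the same idea follows. Only `OPENBLAS_NUM_THREADS` comes from this thread; `OMP_NUM_THREADS` and `MKL_NUM_THREADS` are added on the assumption that the numpy build might use a different backend, and `limit_blas_threads` is a made-up helper name. The variables have to be set before numpy is first imported in the process.

```python
import os

# OPENBLAS_NUM_THREADS is the variable used above; the other two are
# assumptions covering OpenMP-based and MKL-based builds.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(n_threads=1):
    """Set the thread-count variables for this process and its children."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)


limit_blas_threads(1)

# import numpy as np  # import numpy only after the variables are set
```

Setting them after numpy has already been imported would presumably be too late, since the thread pool in the log is created during import.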

rgommers commented 2 years ago

Duplicate of https://github.com/numpy/numpy/issues/19145 and https://github.com/numpy/numpy/issues/17856. It has gotten better since this bug report. The conclusion was that there isn't much more that OpenBLAS can do easily. I'll close this issue as a duplicate.

chongchonghe commented 2 years ago

I still have this problem in 2022 with Python 3.9.0 and numpy 1.22.3. Adding export OPENBLAS_NUM_THREADS=1 to .bashrc seems to solve the problem.
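If editing .bashrc is not an option, the same variable can be passed to one child process only. This is just a sketch: `run_with_single_blas_thread` is a hypothetical helper, and the child here merely echoes the variable back instead of importing numpy.

```python
import os
import subprocess
import sys


def run_with_single_blas_thread(args):
    """Run a command with OPENBLAS_NUM_THREADS=1 added to a copy of the environment."""
    env = dict(os.environ, OPENBLAS_NUM_THREADS="1")
    return subprocess.run(args, env=env, capture_output=True, text=True, check=True)


# Stand-in for the real script: the child prints the value it sees.
result = run_with_single_blas_thread(
    [sys.executable, "-c", "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"]
)
print(result.stdout.strip())
```

The parent's own environment is left untouched, so other jobs started from the same shell keep their default thread count.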

mattip commented 2 years ago

This issue is closed. Please open a new one with the error message. Is there anything out of the ordinary about this machine: does it have a large number of CPUs or is it lacking memory?

chongchonghe commented 2 years ago

Yes, it is on an HPC with hundreds of nodes and 128 cores on each node. Do you mean export OPENBLAS_NUM_THREADS=1 is required in such a situation?

mattip commented 2 years ago

Hundreds of nodes with 128 cores on each node is not a configuration we can test on, so stock NumPy (or OpenBLAS) will need some guidance on resource allocation. This may have improved in the 1.23 releases, but ultimately some strategy to allocate resources will be needed.
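One possible allocation strategy, sketched under the assumption that each worker process should get an equal share of the cores the job is allowed to use. Both function names are hypothetical, and `os.sched_getaffinity` is only available on some platforms (Linux among them), hence the fallback.

```python
import os


def available_cpus():
    """CPUs this process may run on, which can be fewer than the node has."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def threads_per_worker(n_cpus, n_workers):
    """Split the CPUs evenly across workers, never going below one thread."""
    return max(1, n_cpus // n_workers)


# Example: 16 worker processes sharing whatever this job was allocated.
n_threads = threads_per_worker(available_cpus(), 16)
os.environ["OPENBLAS_NUM_THREADS"] = str(n_threads)
print(n_threads)
```

With this split the total thread count stays close to the number of allocated cores instead of growing with the product of workers and cores.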

vlizanae commented 1 year ago

Does OPENBLAS_NUM_THREADS=1 have a heavy impact on performance?

btw still having this issue in a single node with 40 cores, python 3.11.4 + numpy 1.25.1, should I open an issue?

vlizanae commented 1 year ago

Never mind, it was an issue with Docker (seccomp not allowing pthread_create).
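A quick way to tell this case apart from the thread-limit case is to check whether the process can start any thread at all. This is a rough sketch with a made-up helper name; it relies on CPython raising `RuntimeError` when the underlying thread creation fails.

```python
import threading


def can_start_thread():
    """Return True if one extra thread can be created in this process."""
    worker = threading.Thread(target=lambda: None)
    try:
        worker.start()
    except RuntimeError:
        # CPython reports a failed thread creation as RuntimeError.
        return False
    worker.join()
    return True


print(can_start_thread())
```

If this already fails in a fresh container with nothing else running, the sandbox configuration is a more likely culprit than the number of BLAS threads.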

zengqingfu1442 commented 10 months ago

> (seccomp not allowing pthread_create)

Was it caused by libseccomp? Are there any related links about it?

vlizanae commented 9 months ago

I found the workaround in a somewhat unrelated conversation (https://github.com/HumanSignal/label-studio/issues/3070). The thing is that apparently this does not affect current versions of Docker, so I couldn't find more information about it at the time.

zengqingfu1442 commented 9 months ago

> I found the workaround in a somewhat unrelated conversation (HumanSignal/label-studio#3070). The thing is that apparently this does not affect current versions of Docker, so I couldn't find more information about it at the time.

I use the docker run --security-opt seccomp=unconfined workaround, but it is not recommended in production environments.