theislab / cellrank

CellRank: dynamics from multi-view single-cell data
https://cellrank.org
BSD 3-Clause "New" or "Revised" License

Segfault / python free() invalid size error during compute_transition_matrix #1148

Open KforKuma opened 10 months ago

KforKuma commented 10 months ago

I followed the tutorial here with the pancreas example dataset. Every time I reach the line vk.compute_transition_matrix(), Python crashes with a segfault. Sometimes it just crashes; other times it prints very long output, which is attached below.

import numpy as np
import cellrank as cr
import scanpy as sc
import scvelo as scv
scv.settings.verbosity = 3
cr.settings.verbosity = 2
adata = cr.datasets.pancreas()
scv.pp.filter_and_normalize(
    adata, min_shared_counts=20, n_top_genes=2000, subset_highly_variable=False
)
sc.tl.pca(adata)
sc.pp.neighbors(adata, n_pcs=30, n_neighbors=30, random_state=0)
scv.pp.moments(adata, n_pcs=None, n_neighbors=None)
scv.tl.recover_dynamics(adata, n_jobs=8)
scv.tl.velocity(adata, mode="dynamical")
vk = cr.kernels.VelocityKernel(adata)
vk.compute_transition_matrix()
  0%|          | 0/2531 [00:00<?, ?cell/s]
*** Error in `python': free(): invalid size: 0x00002b9516eb9c70 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x306ca7d5bd)[0x2b950f2cf5bd]
/lib64/libpthread.so.0(pthread_create+0x26a)[0x2b950e72b2ea]
python(PyThread_start_new_thread+0xa3)[0x2b950e33b3c3]
python(+0x24280a)[0x2b950e35d80a]
python(+0x14bb38)[0x2b950e266b38]
python(_PyObject_MakeTpCall+0x152)[0x2b950e25da62]
python(_PyEval_EvalFrameDefault+0x4dea)[0x2b950e30385a]
python(_PyFunction_Vectorcall+0x25d)[0x2b950e2e11bd]
python(_PyEval_EvalFrameDefault+0x60b)[0x2b950e2ff07b]
python(+0x1c51b9)[0x2b950e2e01b9]
python(+0x1c7ec4)[0x2b950e2e2ec4]
python(PyObject_Call+0x1aa)[0x2b950e26411a]
python(_PyEval_EvalFrameDefault+0x2c0b)[0x2b950e30167b]
python(_PyFunction_Vectorcall+0x25d)[0x2b950e2e11bd]
python(_PyEval_EvalFrameDefault+0x60b)[0x2b950e2ff07b]
python(_PyFunction_Vectorcall+0x25d)[0x2b950e2e11bd]
python(_PyEval_EvalFrameDefault+0x60b)[0x2b950e2ff07b]
python(+0x1c51b9)[0x2b950e2e01b9]
python(+0x1c7ec4)[0x2b950e2e2ec4]
python(PyObject_Call+0x1aa)[0x2b950e26411a]
python(+0x279d05)[0x2b950e394d05]
python(+0x215dd4)[0x2b950e330dd4]
/lib64/libpthread.so.0(+0x306d207df3)[0x2b950e72adf3]
/lib64/libc.so.6(clone+0x6d)[0x2b950f3482cd]
======= Memory map: ========
479d4000000-47a14000000 rw-p 00000000 00:00 0 
2b950e11b000-2b950e175000 r--p 00000000 00:14 7296965889                 /data/HeLab/miniconda3/envs/scvpy10/bin/python3.10
2b950e175000-2b950e3cc000 r-xp 0005a000 00:14 7296965889                 /data/HeLab/miniconda3/envs/scvpy10/bin/python3.10
2b950e3cc000-2b950e4c3000 r--p 002b1000 00:14 7296965889                 /data/HeLab/miniconda3/envs/scvpy10/bin/python3.10
2b950e4c3000-2b950e4c8000 r--p 003a7000 00:14 7296965889                 /data/HeLab/miniconda3/envs/scvpy10/bin/python3.10
2b950e4c8000-2b950e4fa000 rw-p 003ac000 00:14 7296965889                 /data/HeLab/miniconda3/envs/scvpy10/bin/python3.10
2b950e4fa000-2b950e500000 rw-p 00000000 00:00 0 
2b950e500000-2b950e521000 r-xp 00000000 08:01 40                         /lib64/ld-2.17.so
2b950e521000-2b950e522000 rw-p 00000000 00:00 0 
2b950e522000-2b950e523000 rw-p 00000000 00:00 0 
2b950e720000-2b950e721000 r--p 00020000 08:01 40                         /lib64/ld-2.17.so
2b950e721000-2b950e722000 rw-p 00021000 08:01 40                         /lib64/ld-2.17.so
2b950e722000-2b950e723000 rw-p 00000000 00:00 0 
2b950e723000-2b950e739000 r-xp 00000000 08:01 78                         /lib64/libpthread-2.17.so
2b950e739000-2b950e939000 ---p 00016000 08:01 78                         /lib64/libpthread-2.17.so
2b950e939000-2b950e93a000 r--p 00016000 08:01 78                         /lib64/libpthread-2.17.so
2b950e93a000-2b950e93b000 rw-p 00017000 08:01 78                         /lib64/libpthread-2.17.so
2b950e93b000-2b950e940000 rw-p 00000000 00:00 0 
2b950e940000-2b950e943000 r-xp 00000000 08:01 56                         /lib64/libdl-2.17.so
2b950e943000-2b950eb42000 ---p 00003000 08:01 56                         /lib64/libdl-2.17.so
2b950eb42000-2b950eb43000 r--p 00002000 08:01 56                         /lib64/libdl-2.17.so
2b950eb43000-2b950eb44000 rw-p 00003000 08:01 56                         /lib64/libdl-2.17.so
2b950eb44000-2b950eb46000 r-xp 00000000 08:01 311                        /lib64/libutil-2.17.so
2b950eb46000-2b950ed45000 ---p 00002000 08:01 311                        /lib64/libutil-2.17.so
2b950ed45000-2b950ed46000 r--p 00001000 08:01 311                        /lib64/libutil-2.17.so
2b950ed46000-2b950ed47000 rw-p 00002000 08:01 311                        /lib64/libutil-2.17.so
2b950ed47000-2b950ed4e000 r-xp 00000000 08:01 114                        /lib64/librt-2.17.so
2b950ed4e000-2b950ef4d000 ---p 00007000 08:01 114                        /lib64/librt-2.17.so
2b950ef4d000-2b950ef4e000 r--p 00006000 08:01 114                        /lib64/librt-2.17.so
2b950ef4e000-2b950ef4f000 rw-p 00007000 08:01 114                        /lib64/librt-2.17.so
2b950ef4f000-2b950ef50000 rw-p 00000000 00:00 0 
2b950ef50000-2b950f051000 r-xp 00000000 08:01 144                        /lib64/libm-2.17.so
2b950f051000-2b950f250000 ---p 00101000 08:01 144                        /lib64/libm-2.17.so
2b950f250000-2b950f251000 r--p 00100000 08:01 144                        /lib64/libm-2.17.so
2b950f251000-2b950f252000 rw-p 00101000 08:01 144                        /lib64/libm-2.17.so
2b950f252000-2b950f408000 r-xp 00000000 08:01 44                         /lib64/libc-2.17.so
2b950f408000-2b950f608000 ---p 001b6000 08:01 44                         /lib64/libc-2.17.so
2b950f608000-2b950f60c000 r--p 001b6000 08:01 44                         /lib64/libc-2.17.so
2b950f60c000-2b950f60e000 rw-p 001ba000 08:01 44                         /lib64/libc-2.17.so
2b950f60e000-2b950f616000 rw-p 00000000 00:00 0 
2b9510063000-2b9510084000 rw-p 00000000 00:00 0                          [heap]
2b9510084000-2b95165ab000 r--p 00000000 08:01 813623                     /usr/lib/locale/locale-archive
2b95165ab000-2b95165b2000 r--s 00000000 08:01 794980                     /usr/lib64/gconv/gconv-modules.cache
2b95165b2000-2b95165b4000 r-xp 00000000 08:01 788237                     /usr/lib64/gconv/ISO8859-15.so
2b95165b4000-2b95167b3000 ---p 00002000 08:01 788237                     /usr/lib64/gconv/ISO8859-15.so
2b95167b3000-2b95167b4000 r--p 00001000 08:01 788237                     /usr/lib64/gconv/ISO8859-15.so
2b95167b4000-2b95167b5000 rw-p 00002000 08:01 788237                     /usr/lib64/gconv/ISO8859-15.so
2b95167b5000-2b9516df7000 rw-p 00000000 00:00 0 
2b9516df7000-2b9516dfa000 r--p 00000000 00:14 7293824361                 /data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/lib-dynload/readline.cpython-310-x86_64-linux-gnu.so
2b9516dfa000-2b9516dfc000 r-xp 00003000 00:14 7293824361                 /data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/lib-dynload/readline.cpython-310-x86_64-linux-gnu.so
2b9516dfc000-2b9516dfe000 r--p 00005000 00:14 7293824361                 /data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/lib-dynload/readline.cpython-310-x86_64-linux-gnu.so
2b9516dfe000-2b9516dff000 ---p 00007000 00:14 7293824361                 /data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/lib-dynload/readline.cpython-310-x86_64-linux-gnu.so
2b9516dff000-2b9516e00000 r--p 00007000 00:14 7293824361                 /data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/lib-dynload/readline.cpython-310-x86_64-linux-gnu.so
2b9516e00000-2b9516e01000 rw-p 00008000 00:14 7293824361                 /data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/lib-dynload/readline.cpython-310-x86_64-linux-gnu.so
2b9516e01000-2b9516e1a000 r--p 00000000 00:14 7292773993                 /data/HeLab/miniconda3/envs/scvpy10/lib/libreadline.so.8.2
2b9516e1a000-2b9516e46000 r-xp 00019000 00:14 7292773993                 /data/HeLab/miniconda3/envs/scvpy10/lib/libreadline.so.8.2
2b9516e46000-2b9516e50000 r--p 00045000 00:14 7292773993                 /data/HeLab/miniconda3/envs/scvpy10/lib/libreadline.so.8.2
2b9516e50000-2b9516e53000 r--p 0004f000 00:14 7292773993                 /data/HeLab/miniconda3/envs/scvpy10/lib/libreadline.so.8.2
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/site-packages/cellrank/kernels/_velocity_kernel.py", line 160, in compute_transition_matrix
    softmax_scale = self._estimate_softmax_scale(backward_mode=backward_mode, similarity=similarity)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/site-packages/cellrank/kernels/_velocity_kernel.py", line 254, in _estimate_softmax_scale
    _, logits = model(n_jobs, backend)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/site-packages/cellrank/kernels/utils/_velocity_model.py", line 64, in __call__
    return parallelize(
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/site-packages/cellrank/_utils/_parallelize.py", line 90, in wrapper
    queue = multiprocessing.Manager().Queue()
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/managers.py", line 723, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/managers.py", line 606, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/connection.py", line 513, in Client
    answer_challenge(c, authkey)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/connection.py", line 757, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/connection.py", line 221, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/data/HeLab/miniconda3/envs/scvpy10/lib/python3.10/multiprocessing/connection.py", line 384, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
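
The output above mixes a glibc abort from a worker thread with a hang in the main thread, which only ended on ^C. One way to get Python-level tracebacks for every thread in such a case is the standard-library faulthandler module. The following is a minimal sketch, not part of cellrank; the helper names and the log file naming are hypothetical.

```python
import faulthandler
import os
import tempfile


def default_log_path():
    """Per-process log file in the system temp directory (hypothetical naming)."""
    return os.path.join(tempfile.gettempdir(), f"crash_trace_{os.getpid()}.log")


def enable_crash_tracebacks(log_path=None, hang_timeout=None):
    """Write Python tracebacks of all threads to a file on a fatal signal.

    If hang_timeout (seconds) is given, the tracebacks are also dumped once
    after that delay, which helps when the process hangs instead of crashing.
    Returns the open file; it must stay open while faulthandler is enabled.
    """
    log_file = open(log_path or default_log_path(), "w")
    # Covers SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    if hang_timeout is not None:
        faulthandler.dump_traceback_later(hang_timeout, repeat=False, file=log_file)
    return log_file
```

Calling enable_crash_tracebacks(hang_timeout=120) before vk.compute_transition_matrix() would record where each thread was when the abort or the hang happened. Running the script with python -X faulthandler gives the crash part of this without any code changes.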

Versions:

cellrank==2.0.2 scanpy==1.9.6 anndata==0.10.3 numpy==1.26.2 numba==0.58.1 scipy==1.11.4 pandas==2.1.4 pygpcca==1.0.4 scikit-learn==1.1.3 statsmodels==0.14.1 scvelo==0.3.1 pygam==0.8.0 matplotlib==3.7.1 seaborn==0.12.2

Also, my Python is 3.10 and I am on a CentOS 6 server. I'm a little concerned about using such an old system; does it matter? I have seen many examples where people on Ubuntu simply sudo install libmalloc-minimal4, but mine seems to be a different situation.
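
Whether the age of the system matters is not settled in this thread. A quick way to record what the interpreter sees is to print the kernel release and the C library version. The sketch below uses only the standard library. The (2, 17) threshold is an assumption for illustration: it is the minimum glibc for manylinux2014 wheels, and the packages installed here may have been built against something else.

```python
import platform
from itertools import takewhile


def parse_version(text):
    """Turn a dotted version string such as '2.17' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def describe_platform():
    """Collect the Python, kernel and C library versions of this process."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": platform.python_version(),
        "kernel": platform.release(),
        "libc": libc_name or "unknown",
        "libc_version": parse_version(libc_version),
    }


def meets_minimum(found, required):
    """True if a non-empty version tuple is at least the required one."""
    return bool(found) and found >= required


if __name__ == "__main__":
    info = describe_platform()
    print(info)
    # Assumed threshold: manylinux2014 wheels need glibc 2.17 or newer.
    print("glibc >= 2.17:", meets_minimum(info["libc_version"], (2, 17)))
```

Posting this output alongside the package versions makes it easier for maintainers to tell a platform mismatch from a library bug.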

KforKuma commented 10 months ago

Here is another kind of bug: after I ran scv.tl.recover_dynamics(adata, n_jobs=8) and then gc.collect(), it crashed. Maybe I should open an issue in scvelo too?

Marius1311 commented 10 months ago

Here is another kind of bug: after I ran scv.tl.recover_dynamics(adata, n_jobs=8) and then gc.collect(), it crashed. Maybe I should open an issue in scvelo too?

Yes, that seems like an scvelo bug.

Marius1311 commented 10 months ago

Hi @michalk8, do you have any idea of what could have caused the original bug here?

Marius1311 commented 6 months ago

Does this issue still persist @KforKuma ?

KforKuma commented 6 months ago

Nope... I tried a few times with several different Python and cellrank versions. I suspect this may be related to scikit-learn, because I repeatedly hit an "It seems that scikit-learn has not been built correctly." kind of error if I import cellrank (or any other package that depends on sklearn) first. This happens even though sklearn seems to be installed correctly: when I run python -c "import sklearn; sklearn.show_versions()" it returns:

System:
    python: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
executable: /data/HeLab/miniconda3/envs/cellranknew/bin/python
   machine: Linux-2.6.32-431.11.2.el6.x86_64-x86_64-with-glibc2.17

Python dependencies:
      sklearn: 1.4.2
          pip: 24.0
   setuptools: 69.5.1
        numpy: 1.26.4
        scipy: 1.13.0
       Cython: None
       pandas: 2.2.2
   matplotlib: 3.8.4
       joblib: 1.4.2
threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
    num_threads: 24
         prefix: libgomp
       filepath: /data/HeLab/miniconda3/envs/cellranknew/lib/python3.10/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
        version: None

       user_api: blas
   internal_api: openblas
    num_threads: 24
         prefix: libopenblas
       filepath: /data/HeLab/miniconda3/envs/cellranknew/lib/libopenblasp-r0.3.27.so
        version: 0.3.27
threading_layer: pthreads
   architecture: Sandybridge

I have to import sklearn first to avoid the above-mentioned error. Could this imply a problem there?
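
The report above shows scikit-learn loading a libgomp bundled inside site-packages/scikit_learn.libs, while OpenBLAS comes from the conda environment's lib directory. The thread does not confirm that this is the cause, but an import-order dependence is at least consistent with more than one copy of a threading runtime ending up in the same process. A minimal sketch for checking this on Linux follows; it parses text in the /proc/self/maps format (the same format as the memory map in the first post). The function names and the set of library families are assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Library families to look for (an assumed, non-exhaustive list).
RUNTIME_PATTERN = re.compile(r"lib(gomp|omp|iomp5|openblas)[^/]*\.so")


def loaded_runtimes(maps_text):
    """Group the paths of loaded threading/BLAS libraries by family."""
    found = defaultdict(set)
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        match = RUNTIME_PATTERN.match(name)
        if match:
            found[match.group(1)].add(path)
    return {family: sorted(paths) for family, paths in found.items()}


def read_own_maps():
    """Return the memory map of the current process (Linux only)."""
    with open("/proc/self/maps") as handle:
        return handle.read()
```

Running loaded_runtimes(read_own_maps()) once after importing cellrank first and once after importing sklearn first, then comparing the two results, would show whether a family such as gomp is loaded from two different paths in one of the orders.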