scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

leiden alg with igraph flavor causes out of bounds freezing #2969

Open RubenVanEsch opened 5 months ago

RubenVanEsch commented 5 months ago

What happened?

I was running the standard pipeline on some data, and when I run sc.tl.leiden(em_adata,flavor='igraph',n_iterations=2,random_state=1653,directed=False) it spits out a seemingly endless stream of ignored exceptions. It does not actually crash the kernel, but it does bog it down and makes everything take much more time than necessary. I am working in a conda env on a Windows 10, 64-bit, x64 system. The problem also occurs with the pbmc3k dataset.

Minimal code sample

import scanpy as sc

# example with own data (em_adata is an AnnData object loaded beforehand), but same happens with pbmc3k data
sc.pp.filter_cells(em_adata, min_genes=200)
sc.pp.filter_genes(em_adata, min_cells=3)
em_adata.shape
# [out] -> (42753, 21636)

sc.pp.calculate_qc_metrics(em_adata, qc_vars=["mt"], percent_top=None, log1p=False, inplace=True)
em_adata.obs["outlier_mt"] = em_adata.obs.pct_counts_mt > 15
em_adata.obs["outlier_total"] = em_adata.obs.total_counts > 30000
em_adata.obs["outlier_ngenes"] = em_adata.obs.n_genes_by_counts > 6000
em_adata = em_adata[~em_adata.obs["outlier_mt"], :]
em_adata = em_adata[~em_adata.obs["outlier_total"], :]
em_adata = em_adata[~em_adata.obs["outlier_ngenes"], :]
sc.pp.filter_genes(em_adata,min_cells=1)

sc.pp.scrublet(em_adata)
em_adata.layers['counts'] = em_adata.X.copy()
sc.pp.normalize_total(em_adata)
sc.pp.log1p(em_adata)
sc.pp.highly_variable_genes(em_adata,flavor='seurat')
sc.pl.highly_variable_genes(em_adata)
em_adata = em_adata[:, em_adata.var["highly_variable"]]
em_adata.shape
# [out] -> (41749, 1425)
sc.pp.pca(em_adata, n_comps=50)
sc.pp.neighbors(em_adata)
sc.tl.umap(em_adata)
sc.tl.leiden(em_adata,flavor='igraph',n_iterations=2,random_state=1653,directed=False)

Error output

Exception ignored in: <class 'ValueError'>
Traceback (most recent call last):
  File "numpy\\random\\mtrand.pyx", line 780, in numpy.random.mtrand.RandomState.randint
  File "numpy\\random\\_bounded_integers.pyx", line 2881, in numpy.random._bounded_integers._rand_int32
ValueError: high is out of bounds for int32

Versions

``` conda env: # Name Version Build Channel _r-mutex 1.0.0 anacondar_1 anndata 0.10.6 pypi_0 pypi anyio 4.3.0 pypi_0 pypi argon2-cffi 23.1.0 pypi_0 pypi argon2-cffi-bindings 21.2.0 py311h2bbff1b_0 array-api-compat 1.5.1 pypi_0 pypi arrow 1.3.0 pypi_0 pypi asttokens 2.4.1 pypi_0 pypi async-lru 2.0.4 py311haa95532_0 attrs 23.2.0 pypi_0 pypi babel 2.14.0 pypi_0 pypi beautifulsoup4 4.12.3 pypi_0 pypi bleach 6.1.0 pypi_0 pypi brotli-python 1.0.9 py311hd77b12b_7 bzip2 1.0.8 h2bbff1b_5 ca-certificates 2023.12.12 haa95532_0 certifi 2024.2.2 py311haa95532_0 cffi 1.16.0 py311h2bbff1b_0 chardet 5.2.0 pypi_0 pypi charset-normalizer 3.3.2 pypi_0 pypi colorama 0.4.6 py311haa95532_0 comm 0.2.2 pypi_0 pypi contourpy 1.2.0 pypi_0 pypi cycler 0.12.1 pypi_0 pypi debugpy 1.8.1 pypi_0 pypi decorator 5.1.1 pyhd3eb1b0_0 defusedxml 0.7.1 pyhd3eb1b0_0 executing 2.0.1 pypi_0 pypi fastjsonschema 2.19.1 pypi_0 pypi fonttools 4.50.0 pypi_0 pypi fqdn 1.5.1 pypi_0 pypi h11 0.14.0 pypi_0 pypi h5py 3.10.0 pypi_0 pypi httpcore 1.0.4 pypi_0 pypi httpx 0.27.0 pypi_0 pypi idna 3.6 pypi_0 pypi igraph 0.11.4 pypi_0 pypi imageio 2.34.0 pypi_0 pypi ipykernel 6.29.3 pypi_0 pypi ipython 8.22.2 pypi_0 pypi ipywidgets 8.1.2 pypi_0 pypi isoduration 20.11.0 pypi_0 pypi jedi 0.19.1 pypi_0 pypi jinja2 3.1.3 py311haa95532_0 joblib 1.3.2 pypi_0 pypi json5 0.9.22 pypi_0 pypi jsonpointer 2.4 pypi_0 pypi jsonschema 4.21.1 pypi_0 pypi jsonschema-specifications 2023.12.1 pypi_0 pypi jupyter-client 8.6.1 pypi_0 pypi jupyter-core 5.7.2 pypi_0 pypi jupyter-events 0.9.1 pypi_0 pypi jupyter-lsp 2.2.4 pypi_0 pypi jupyter-server 2.13.0 pypi_0 pypi jupyter-server-terminals 0.5.3 pypi_0 pypi jupyter_client 8.6.0 
py311haa95532_0 jupyter_core 5.5.0 py311haa95532_0 jupyter_events 0.8.0 py311haa95532_0 jupyter_server 2.10.0 py311haa95532_0 jupyter_server_terminals 0.4.4 py311haa95532_1 jupyterlab 4.1.5 pypi_0 pypi jupyterlab-pygments 0.3.0 pypi_0 pypi jupyterlab-server 2.25.4 pypi_0 pypi jupyterlab-widgets 3.0.10 pypi_0 pypi jupyterlab_pygments 0.1.2 py_0 jupyterlab_server 2.25.1 py311haa95532_0 kiwisolver 1.4.5 pypi_0 pypi lazy-loader 0.3 pypi_0 pypi legacy-api-wrap 1.4 pypi_0 pypi leidenalg 0.10.2 pypi_0 pypi libffi 3.4.4 hd77b12b_0 libsodium 1.0.18 h62dcd97_0 llvmlite 0.42.0 pypi_0 pypi m2w64-bwidget 1.9.10 2 m2w64-bzip2 1.0.6 6 m2w64-expat 2.1.1 2 m2w64-fftw 3.3.4 6 m2w64-flac 1.3.1 3 m2w64-gcc-libgfortran 5.3.0 6 m2w64-gcc-libs 5.3.0 7 m2w64-gcc-libs-core 5.3.0 7 m2w64-gettext 0.19.7 2 m2w64-gmp 6.1.0 2 m2w64-gsl 2.1 2 m2w64-libiconv 1.14 6 m2w64-libjpeg-turbo 1.4.2 3 m2w64-libogg 1.3.2 3 m2w64-libpng 1.6.21 2 m2w64-libsndfile 1.0.26 2 m2w64-libsodium 1.0.10 2 m2w64-libtiff 4.0.6 2 m2w64-libvorbis 1.3.5 2 m2w64-libwinpthread-git 5.0.0.4634.697f757 2 m2w64-libxml2 2.9.3 4 m2w64-mpfr 3.1.4 4 m2w64-openblas 0.2.19 1 m2w64-pcre 8.38 2 m2w64-speex 1.2rc2 3 m2w64-speexdsp 1.2rc3 3 m2w64-tcl 8.6.5 3 m2w64-tk 8.6.5 3 m2w64-tktable 2.10 5 m2w64-wineditline 2.101 5 m2w64-xz 5.2.2 2 m2w64-zeromq 4.1.4 2 m2w64-zlib 1.2.8 10 markupsafe 2.1.5 pypi_0 pypi matplotlib 3.8.3 pypi_0 pypi matplotlib-inline 0.1.6 py311haa95532_0 mistune 3.0.2 pypi_0 pypi msys2-conda-epoch 20160418 1 natsort 8.4.0 pypi_0 pypi nbclient 0.10.0 pypi_0 pypi nbconvert 7.16.2 pypi_0 pypi nbformat 5.10.3 pypi_0 pypi nest-asyncio 1.6.0 py311haa95532_0 networkx 3.2.1 pypi_0 pypi notebook 7.1.2 pypi_0 pypi notebook-shim 0.2.4 pypi_0 pypi numba 0.59.1 pypi_0 pypi numpy 1.26.4 pypi_0 pypi openssl 3.0.13 h2bbff1b_0 overrides 7.7.0 pypi_0 pypi packaging 24.0 pypi_0 pypi pandas 2.2.1 pypi_0 pypi pandocfilters 1.5.1 pypi_0 pypi parso 0.8.3 pyhd3eb1b0_0 patsy 0.5.6 pypi_0 pypi pillow 10.2.0 pypi_0 pypi pip 23.3.1 
py311haa95532_0 platformdirs 4.2.0 pypi_0 pypi plotly 5.20.0 pypi_0 pypi prometheus-client 0.20.0 pypi_0 pypi prometheus_client 0.14.1 py311haa95532_0 prompt-toolkit 3.0.43 py311haa95532_0 prompt_toolkit 3.0.43 hd3eb1b0_0 psutil 5.9.8 pypi_0 pypi pure_eval 0.2.2 pyhd3eb1b0_0 pycparser 2.21 pyhd3eb1b0_0 pygments 2.17.2 pypi_0 pypi pynndescent 0.5.11 pypi_0 pypi pyparsing 3.1.2 pypi_0 pypi pysocks 1.7.1 py311haa95532_0 python 3.11.8 he1021f5_0 python-dateutil 2.9.0.post0 pypi_0 pypi python-fastjsonschema 2.16.2 py311haa95532_0 python-json-logger 2.0.7 py311haa95532_0 pytz 2023.3.post1 py311haa95532_0 pywin32 306 pypi_0 pypi pywinpty 2.0.13 pypi_0 pypi pyyaml 6.0.1 py311h2bbff1b_0 pyzmq 25.1.2 py311hd77b12b_0 r-askpass 1.0 r36_0 r-assertthat 0.2.1 r36h6115d3f_0 r-backports 1.1.4 r36h6115d3f_0 r-base 3.6.1 hf18239d_1 r-base64enc 0.1_3 r36h6115d3f_4 r-bh 1.69.0_1 r36h6115d3f_0 r-boot 1.3_20 r36h6115d3f_0 r-broom 0.5.2 r36h6115d3f_0 r-callr 3.2.0 r36h6115d3f_0 r-caret 6.0_83 r36h6115d3f_0 r-cellranger 1.1.0 r36h6115d3f_0 r-class 7.3_15 r36h6115d3f_0 r-cli 1.1.0 r36h6115d3f_0 r-clipr 0.6.0 r36h6115d3f_0 r-cluster 2.0.8 r36h6115d3f_0 r-codetools 0.2_16 r36h6115d3f_0 r-colorspace 1.4_1 r36h6115d3f_0 r-crayon 1.3.4 r36h6115d3f_0 r-curl 3.3 r36h6115d3f_0 r-data.table 1.12.2 r36h6115d3f_0 r-dbi 1.0.0 r36h6115d3f_0 r-dbplyr 1.4.0 r36h6115d3f_0 r-dichromat 2.0_0 r36h6115d3f_4 r-digest 0.6.18 r36h6115d3f_0 r-dplyr 0.8.0.1 r36h6115d3f_0 r-ellipsis 0.1.0 r36h6115d3f_0 r-essentials 3.6.0 r36_0 r-evaluate 0.13 r36h6115d3f_0 r-fansi 0.4.0 r36h6115d3f_0 r-forcats 0.4.0 r36h6115d3f_0 r-foreach 1.4.4 r36h6115d3f_0 r-foreign 0.8_71 r36h6115d3f_0 r-formatr 1.6 r36h6115d3f_0 r-fs 1.2.7 r36h6115d3f_0 r-generics 0.0.2 r36h6115d3f_0 r-ggplot2 3.1.1 r36h6115d3f_0 r-glmnet 2.0_16 r36h6115d3f_0 r-glue 1.3.1 r36h6115d3f_0 r-gower 0.2.0 r36h6115d3f_0 r-gtable 0.3.0 r36h6115d3f_0 r-haven 2.1.0 r36h6115d3f_0 r-hexbin 1.27.2 r36h6115d3f_0 r-highr 0.8 r36h6115d3f_0 r-hms 0.4.2 r36h6115d3f_0 r-htmltools 
0.3.6 r36h6115d3f_0 r-htmlwidgets 1.3 r36h6115d3f_0 r-httpuv 1.5.1 r36h6115d3f_0 r-httr 1.4.0 r36h6115d3f_0 r-ipred 0.9_8 r36h6115d3f_0 r-irdisplay 0.7.0 r36h6115d3f_0 r-irkernel 0.8.15 r36_0 r-iterators 1.0.10 r36h6115d3f_0 r-jsonlite 1.6 r36h6115d3f_0 r-kernsmooth 2.23_15 r36h6115d3f_4 r-knitr 1.22 r36h6115d3f_0 r-labeling 0.3 r36h6115d3f_4 r-later 0.8.0 r36h6115d3f_0 r-lattice 0.20_38 r36h6115d3f_0 r-lava 1.6.5 r36h6115d3f_0 r-lazyeval 0.2.2 r36h6115d3f_0 r-lubridate 1.7.4 r36h6115d3f_0 r-magrittr 1.5 r36h6115d3f_4 r-maps 3.3.0 r36h6115d3f_0 r-markdown 0.9 r36h6115d3f_0 r-mass 7.3_51.3 r36h6115d3f_0 r-matrix 1.2_17 r36h6115d3f_0 r-mgcv 1.8_28 r36h6115d3f_0 r-mime 0.6 r36h6115d3f_0 r-modelmetrics 1.2.2 r36h6115d3f_0 r-modelr 0.1.4 r36h6115d3f_0 r-munsell 0.5.0 r36h6115d3f_0 r-nlme 3.1_139 r36h6115d3f_0 r-nnet 7.3_12 r36h6115d3f_0 r-numderiv 2016.8_1 r36h6115d3f_0 r-openssl 1.3 r36h6115d3f_0 r-pbdzmq 0.3_3 r36h6115d3f_0 r-pillar 1.3.1 r36h6115d3f_0 r-pkgconfig 2.0.2 r36h6115d3f_0 r-plogr 0.2.0 r36h6115d3f_0 r-plyr 1.8.4 r36h6115d3f_0 r-prettyunits 1.0.2 r36h6115d3f_0 r-processx 3.3.0 r36h6115d3f_0 r-prodlim 2018.04.18 r36h6115d3f_0 r-progress 1.2.0 r36h6115d3f_0 r-promises 1.0.1 r36h6115d3f_0 r-ps 1.3.0 r36h6115d3f_0 r-purrr 0.3.2 r36h6115d3f_0 r-quantmod 0.4_14 r36h6115d3f_0 r-r6 2.4.0 r36h6115d3f_0 r-randomforest 4.6_14 r36h6115d3f_0 r-rbokeh 0.6.3 r36_0 r-rcolorbrewer 1.1_2 r36h6115d3f_0 r-rcpp 1.0.1 r36h6115d3f_0 r-rcpproll 0.3.0 r36h6115d3f_0 r-readr 1.3.1 r36h6115d3f_0 r-readxl 1.3.1 r36h6115d3f_0 r-recipes 0.1.5 r36h6115d3f_0 r-recommended 3.6.0 r36_0 r-rematch 1.0.1 r36h6115d3f_0 r-repr 0.19.2 r36h6115d3f_0 r-reprex 0.2.1 r36h6115d3f_0 r-reshape2 1.4.3 r36h6115d3f_0 r-rlang 0.3.4 r36h6115d3f_0 r-rmarkdown 1.12 r36h6115d3f_0 r-rpart 4.1_15 r36h6115d3f_0 r-rstudioapi 0.10 r36h6115d3f_0 r-rvest 0.3.3 r36h6115d3f_0 r-scales 1.0.0 r36h6115d3f_0 r-selectr 0.4_1 r36h6115d3f_0 r-shiny 1.3.2 r36h6115d3f_0 r-sourcetools 0.1.7 r36h6115d3f_0 r-spatial 7.3_11 
r36h6115d3f_4 r-squarem 2017.10_1 r36h6115d3f_0 r-stringi 1.4.3 r36h6115d3f_0 r-stringr 1.4.0 r36h6115d3f_0 r-survival 2.44_1.1 r36h6115d3f_0 r-sys 3.2 r36h6115d3f_0 r-tibble 2.1.1 r36h6115d3f_0 r-tidyr 0.8.3 r36h6115d3f_0 r-tidyselect 0.2.5 r36h6115d3f_0 r-tidyverse 1.2.1 r36h6115d3f_0 r-timedate 3043.102 r36h6115d3f_0 r-tinytex 0.12 r36h6115d3f_0 r-ttr 0.23_4 r36h6115d3f_0 r-utf8 1.1.4 r36h6115d3f_0 r-uuid 0.1_2 r36h6115d3f_4 r-viridislite 0.3.0 r36h6115d3f_0 r-whisker 0.3_2 r36h6115d3f_4 r-withr 2.1.2 r36h6115d3f_0 r-xfun 0.6 r36h6115d3f_0 r-xml2 1.2.0 r36h6115d3f_0 r-xtable 1.8_4 r36h6115d3f_0 r-xts 0.11_2 r36h6115d3f_0 r-yaml 2.2.0 r36h6115d3f_0 r-zoo 1.8_5 r36h6115d3f_0 referencing 0.33.0 pypi_0 pypi requests 2.31.0 py311haa95532_1 rfc3339-validator 0.1.4 py311haa95532_0 rfc3986-validator 0.1.1 py311haa95532_0 rpds-py 0.18.0 pypi_0 pypi scanpy 1.10.0 pypi_0 pypi scikit-image 0.22.0 pypi_0 pypi scikit-learn 1.4.1.post1 pypi_0 pypi scikit-misc 0.3.1 pypi_0 pypi scipy 1.12.0 pypi_0 pypi seaborn 0.13.2 pypi_0 pypi send2trash 1.8.2 py311haa95532_0 session-info 1.0.0 pypi_0 pypi setuptools 68.2.2 py311haa95532_0 six 1.16.0 pyhd3eb1b0_1 sniffio 1.3.1 pypi_0 pypi soupsieve 2.5 py311haa95532_0 sqlite 3.41.2 h2bbff1b_0 stack-data 0.6.3 pypi_0 pypi stack_data 0.2.0 pyhd3eb1b0_0 statsmodels 0.14.1 pypi_0 pypi stdlib-list 0.10.0 pypi_0 pypi tenacity 8.2.3 pypi_0 pypi terminado 0.18.1 pypi_0 pypi texttable 1.7.0 pypi_0 pypi threadpoolctl 3.4.0 pypi_0 pypi tifffile 2024.2.12 pypi_0 pypi tinycss2 1.2.1 py311haa95532_0 tk 8.6.12 h2bbff1b_0 tornado 6.4 pypi_0 pypi tqdm 4.66.2 pypi_0 pypi traitlets 5.14.2 pypi_0 pypi types-python-dateutil 2.9.0.20240315 pypi_0 pypi typing-extensions 4.9.0 py311haa95532_1 typing_extensions 4.9.0 py311haa95532_1 tzdata 2024.1 pypi_0 pypi umap-learn 0.5.5 pypi_0 pypi uri-template 1.3.0 pypi_0 pypi urllib3 2.2.1 pypi_0 pypi vc 14.2 h21ff451_1 vs2015_runtime 14.27.29016 h5e58377_2 wcwidth 0.2.13 pypi_0 pypi webcolors 1.13 pypi_0 pypi webencodings 
0.5.1 pypi_0 pypi websocket-client 1.7.0 pypi_0 pypi wheel 0.41.2 py311haa95532_0 widgetsnbextension 4.0.10 pypi_0 pypi win_inet_pton 1.1.0 py311haa95532_0 winpty 0.4.3 4 xz 5.4.6 h8cc25b3_0 yaml 0.2.5 he774522_0 zeromq 4.3.5 hd77b12b_0 zlib 1.2.13 h8cc25b3_0 ```
ilan-gold commented 5 months ago

@RubenVanEsch Can you provide a more minimally reproducible example?

For example, paring down a bit the above to just:

import scanpy as sc

em_adata = sc.datasets.pbmc3k()

sc.pp.pca(em_adata, n_comps=50)
sc.pp.neighbors(em_adata)
sc.tl.umap(em_adata)
sc.tl.leiden(em_adata,flavor='igraph',n_iterations=2,random_state=1653,directed=False)

does not yield any error. Could you share your system info, i.e., Windows or Mac?
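For anyone answering the system-info question, a short script like the one below may help. It is only a sketch: the `environment_report` helper and its field names are made up for illustration, and the C `long` width is included on the assumption that it is relevant to int32 bounds errors like the one reported here.

```python
import ctypes
import platform

import numpy as np


def environment_report():
    """Collect platform details that tend to matter for integer-width bugs."""
    return {
        "os": platform.system(),            # e.g. "Windows", "Linux", "Darwin"
        "machine": platform.machine(),      # e.g. "AMD64", "x86_64", "arm64"
        "python": platform.python_version(),
        "numpy": np.__version__,
        # Pointer width tells you whether the interpreter is a 64-bit build.
        "pointer_bits": ctypes.sizeof(ctypes.c_void_p) * 8,
        # Width of a C long as seen by this interpreter's platform ABI.
        "c_long_bits": ctypes.sizeof(ctypes.c_long) * 8,
    }


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

Pasting the printed lines into the issue gives the OS and integer widths at a glance without a full package listing.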

RubenVanEsch commented 5 months ago

@ilan-gold your minimal example causes the exact same error:

Exception ignored in: <class 'ValueError'>
Traceback (most recent call last):
  File "numpy\random\mtrand.pyx", line 780, in numpy.random.mtrand.RandomState.randint
  File "numpy\random\_bounded_integers.pyx", line 2881, in numpy.random._bounded_integers._rand_int32
ValueError: high is out of bounds for int32

If you are curious, it prints the error 14,210 times (71,050 lines of error message).

EDIT: the random state does not seem to matter btw, also happens with different random states

ilan-gold commented 5 months ago

OK @RubenVanEsch, we have to assume that this is a Windows problem then. I think we will try to set up a Windows test job; hopefully it catches this problem, although it will likely catch others too. What happens without a random_state set?

RubenVanEsch commented 5 months ago

@ilan-gold same thing without a random state. I think it might be a Windows issue related to numpy defaulting to 64-bit integers on Linux but sometimes to 32-bit integers on Windows (those were the first couple of Google hits when I searched for the error). I don't know where the seed is generated in the source code, though.

ilan-gold commented 5 months ago

@RubenVanEsch Yes, and the issue there is that we're not the ones calling randint. We may be able to hack it. I'll have a look at how the pipeline errors out on our CI to maybe see where the call is coming from.

ivirshup commented 5 months ago

If the problem is windows, it's possible it will be solved by numpy 2.0. Not sure how easy the upgrade path to numpy 2.0 will be, however.

ilan-gold commented 5 months ago

I got the test runner to do windows and while there were other errors, this one was seemingly not present: https://dev.azure.com/scverse/scanpy/_build/results?buildId=6287&view=logs&j=4eb20215-89fc-58e4-6218-2c2fa88ddf72&t=482e4b16-75d9-5f8c-9594-aadcd098d2cb&l=3977

We have a test that is strikingly similar to the more minimal example from above: https://github.com/scverse/scanpy/blob/main/scanpy/tests/notebooks/test_pbmc3k.py minus the umap. Could you try this test (which doesn't call umap) and also try it with umap so it's exactly as our little demo and let us know what you get? We also set resolution in the test. This test seems to actually pass on our CI.

In general there will be some back and forth here until we find someone near us with a windows machine since using CI to fix this problem isn't really feasible, but at least we can narrow the scope.

ivirshup commented 5 months ago

@RubenVanEsch, are you able to run this in WSL? Also, does the number you pass for random seed matter?

ilan-gold commented 5 months ago

Also, does the number you pass for random seed matter?

From @RubenVanEsch :

EDIT: the random state does not seem to matter btw, also happens with different random states

RubenVanEsch commented 5 months ago

@ivirshup @ilan-gold just got back to this, thought i could not install wsl as I am on a somewhat company restricted laptop, but turns out i can. installing it now (and probably using that from here on out). will run the tester in a bit and let you know

ivirshup commented 5 months ago

import scanpy as sc

em_adata = sc.datasets.pbmc3k()

sc.pp.pca(em_adata, n_comps=50)
sc.pp.neighbors(em_adata)
sc.tl.umap(em_adata)
sc.tl.leiden(em_adata,flavor='igraph',n_iterations=2,random_state=1653,directed=False)

@melonora, would you mind running this on your windows machine with the latest scanpy release to see if you can reproduce it?

melonora commented 5 months ago

Yes I will and report back. Most likely in the evening.

melonora commented 5 months ago

I can reproduce. This is the error that I get: [image]

From a first glance it seems like the default for randint is used which is int32. I can check whether switching to int64 fixes the issue.

melonora commented 5 months ago

I will see if I can reproduce on main and pinpoint where the problem arises.

RubenVanEsch commented 5 months ago

Do you guys still want me to try and run the test from @ilan-gold ? Or is it fine now that it is reproduced on your side as well?

melonora commented 5 months ago

It is reproduced. It is due to randint being called with an upper bound outside the range of the default dtype, int32. On 64-bit Windows systems the default is int32 despite the system being 64-bit, because a C long is 32 bits wide on these systems.

The part of the code that fails due to this is when using the context manager to perform the leiden clustering with igraph flavor.
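The dtype effect described above can be shown on any platform by requesting the dtype explicitly. This is a minimal sketch, not the call igraph makes: the bound `2**32` and the `draw` helper are chosen purely for illustration, and passing `dtype=np.int32` stands in for the Windows default.

```python
import numpy as np

rng = np.random.RandomState(1653)
high = 2**32  # illustrative upper bound that does not fit in int32


def draw(dtype):
    """Return a drawn integer, or the error message if the bounds do not fit."""
    try:
        return int(rng.randint(0, high, dtype=dtype))
    except ValueError as err:
        return str(err)


int32_result = draw(np.int32)  # emulates a platform where the default is int32
int64_result = draw(np.int64)  # same bounds, wider dtype

print("int32:", int32_result)
print("int64:", int64_result)
```

The int32 call fails with the same "high is out of bounds for int32" message as in the report, while the int64 call returns a value.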

melonora commented 5 months ago

In particular here is the piece of code: https://github.com/scverse/scanpy/blob/a33111f3b2caaa4ee5e33d02b6e98b143023341b/scanpy/tools/_leiden.py#L184-L185

Though the randint is called from within c code within igraph itself. @ivirshup, do you think asking for calling with dtype int64 would be a problem until this part is fixed on the numpy side?

ivirshup commented 5 months ago

Where would you put the dtype=int64 argument?

melonora commented 5 months ago

It wouldn't be on our side. As far as I know the numpy random number generator is called from within c code within igraph itself.

flying-sheep commented 1 month ago

Since we can’t test this without your help, could you check if passing your own RNG here makes it work?
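One way to try this suggestion is sketched below. It is an assumption-laden illustration, not the change proposed by the maintainers: `Int64SafeRNG` and `use_with_igraph` are hypothetical names, and the sketch assumes that igraph's `set_random_number_generator` accepts any object that provides `random`, `randint`, and `gauss` in the style of Python's `random` module. The wrapper forces `dtype=np.int64` so the bounds check no longer depends on the platform's default integer type.

```python
import numpy as np


class Int64SafeRNG:
    """Hypothetical wrapper exposing `random`-module-style methods over numpy."""

    def __init__(self, seed=None):
        self._rng = np.random.RandomState(seed)

    def randint(self, a, b):
        # Mirror random.randint (both ends inclusive); numpy's upper bound
        # is exclusive, hence b + 1. int64 avoids the int32 bounds check.
        return int(self._rng.randint(a, b + 1, dtype=np.int64))

    def random(self):
        # Uniform float in [0.0, 1.0)
        return float(self._rng.random_sample())

    def gauss(self, mu, sigma):
        # Normal distribution with mean mu and standard deviation sigma
        return float(self._rng.normal(mu, sigma))


def use_with_igraph(rng):
    """Register the wrapper with igraph (requires python-igraph; not called here)."""
    import igraph as ig

    ig.set_random_number_generator(rng)


rng = Int64SafeRNG(1653)
value = rng.randint(0, 2**32 - 1)  # bounds wider than int32, for illustration
print(value)
```

Calling `use_with_igraph(rng)` before running the igraph Leiden step directly would test whether the messages disappear. Note that igraph's generator is global state, so it may need to be reset afterwards, for example by passing Python's `random` module back in.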

melonora commented 1 month ago

I can test tomorrow

patrick-nicodemus commented 2 weeks ago

I can reproduce this bug on my machine as well. I can supply additional information or context if needed, and I can test fixes

patrick-nicodemus commented 1 week ago

If the problem is windows, it's possible it will be solved by numpy 2.0. Not sure how easy the upgrade path to numpy 2.0 will be, however.

I can reproduce the error using Numpy 2.0.2.

ilan-gold commented 1 week ago

@patrick-nicodemus What we need more than anything is someone to test out a fix and to confirm that using wsl prevents the problem.

See https://github.com/scverse/scanpy/pull/3041

The issue is that we don't have windows machines.

patrick-nicodemus commented 1 week ago

@ilan-gold If you want to try it out, I give instructions for how to reproduce the error with a Docker container for Windows in the cross-referenced issue. I also have tried it on WSL, and the problem is not present on WSL, so this is a workaround for Windows users. However, I am organizing a Python workshop in a few weeks, and I think it would add some additional administrative burden/overhead to the workshop to coordinate installing and setting up WSL (as we see in #3041, Ruben had trouble installing WSL and others might as well.) So, for me, using WSL is a suboptimal workaround.

flying-sheep commented 1 week ago

If you want to try it out, I give instructions for how to reproduce the error with a Docker container for Windows in the cross-referenced issue

Yes please. I’m confused how Windows comes into play though, since I thought that Docker always runs on a Linux kernel – natively on Linux and in a VM on macOS and Windows.

patrick-nicodemus commented 1 week ago

Yes, this was my impression too. However there is a documented option "Switch to Windows containers" which is available if you right click on the Docker icon in the taskbar and this allows one to run vms using a Windows kernel.
