aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

prune2df fails with TypeError: object of type 'generator' has no len() [BUG] #551

Open dmalzl opened 3 months ago

dmalzl commented 3 months ago

Describe the bug
I am following the full interactive pipeline as detailed in this notebook and am having trouble with the pruning stage. Using pySCENIC v0.12.1 (installed from source, since the PyPI package is broken), I get a dask-related error when running prune2df, and I could not find any related issue in this repo's issue tracker.


Steps to reproduce the behavior

  1. Command run when the error occurred:

```python
import anndata as ad
from distributed import Client, LocalCluster

from arboreto.utils import load_tf_names
from arboreto.algo import grnboost2

from ctxcore.rnkdb import FeatherRankingDatabase as RankingDatabase
from pyscenic.utils import modules_from_adjacencies
from pyscenic.prune import prune2df, df2regulons
from pyscenic.aucell import aucell

adata = ad.read_h5ad()  # path omitted in the original report

with open('../scenic_resource/hs_hgnc_tfs.txt', 'r') as tf_file:
    tf_names = [line.rstrip() for line in tf_file]

cistarget_db = RankingDatabase(
    '../scenic_resource/hg38refseq-r8010kb_up_and_down_tss.mc9nr.genes_vs_motifs.rankings.feather',
    'hg38refseq-r8010kb_up_and_down_tss.mc9nr'
)

# manually restrict number of workers used
client = Client(
    LocalCluster(
        name='grn_call',
        n_workers=8,
        threads_per_worker=1
    )
)

adjacencies = grnboost2(
    expression_data=adata.to_df('counts'),  # convert AnnData to pandas.DataFrame
    tf_names=tf_names,
    client_or_address=client,
    verbose=True
)

inferred_modules = list(
    modules_from_adjacencies(adjacencies, adata.to_df('counts'))
)

# this is actually executed as part of a dict comprehension,
# because I am computing GRNs for multiple datasets,
# but the error also occurs when running it like this,
# so I kept the code like this for brevity
prune2df(
    [cistarget_db],
    inferred_modules,
    '../scenic_resource/motifs-v9-nr.hgnc-m0.001-o0.0.tbl',
    client_or_address=client
)
```


2. Error encountered:
```pytb
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[35], line 12
      2 client = Client(
      3     LocalCluster(
      4         name='grn_call',
   (...)
      7     )
      8 )
     10 dbs = [cistarget_db]
     11 prunded_modules = {
---> 12     k: prune2df(
     13         dbs, 
     14         inferred_modules, 
     15         '../scenic_resource/motifs-v9-nr.hgnc-m0.001-o0.0.tbl',
     16         client_or_address = client
     17     )
     18     for k, inferred_modules
     19     in modules.items()
     20 }

File ~/.conda/envs/scenic/lib/python3.12/site-packages/pyscenic/prune.py:424, in prune2df(rnkdbs, modules, motif_annotations_fname, rank_threshold, auc_threshold, nes_threshold, motif_similarity_fdr, orthologuous_identity_threshold, weighted_recovery, client_or_address, num_workers, module_chunksize, filter_for_annotation)
    418 # Create a distributed dataframe from individual delayed objects to avoid out of memory problems.
    419 aggregation_func = (
    420     partial(from_delayed, meta=DF_META_DATA)
    421     if client_or_address != "custom_multiprocessing"
    422     else pd.concat
    423 )
--> 424 return _distributed_calc(
    425     rnkdbs,
    426     modules,
    427     motif_annotations_fname,
    428     transformation_func,
    429     aggregation_func,
    430     motif_similarity_fdr,
    431     orthologuous_identity_threshold,
    432     client_or_address,
    433     num_workers,
    434     module_chunksize,
    435 )

File ~/.conda/envs/scenic/lib/python3.12/site-packages/pyscenic/prune.py:362, in _distributed_calc(rnkdbs, modules, motif_annotations_fname, transform_func, aggregate_func, motif_similarity_fdr, orthologuous_identity_threshold, client_or_address, num_workers, module_chunksize)
    357 client, shutdown_callback = _prepare_client(
    358     client_or_address,
    359     num_workers=num_workers if num_workers else cpu_count(),
    360 )
    361 try:
--> 362     return client.compute(create_graph(client), sync=True)
    363 finally:
    364     shutdown_callback(False)

File ~/.conda/envs/scenic/lib/python3.12/site-packages/pyscenic/prune.py:340, in _distributed_calc.<locals>.create_graph(client)
    300 delayed_or_future_dbs = list(map(wrap, rnkdbs))
    301 # 3. The gene signatures: these signatures become large when chunking them, therefore chunking is overruled
    302 # when using dask.distributed.
    303 # See earlier.
   (...)
    337 # again be unavoidable. TBI + See following stackoverflow question:
    338 # https://stackoverflow.com/questions/47776936/why-is-a-computation-much-slower-within-a-dask-distributed-worker
--> 340 return aggregate_func(
    341     (
    342         delayed(transform_func)(db, gs_chunk, delayed_or_future_annotations)
    343         for db in delayed_or_future_dbs
    344         for gs_chunk in chunked_iter(modules, module_chunksize)
    345     )
    346 )

File ~/.conda/envs/scenic/lib/python3.12/site-packages/dask_expr/io/_delayed.py:100, in from_delayed(dfs, meta, divisions, prefix, verify_meta)
     97 if isinstance(dfs, Delayed) or hasattr(dfs, "key"):
     98     dfs = [dfs]
--> 100 if len(dfs) == 0:
    101     raise TypeError("Must supply at least one delayed object")
    103 if meta is None:

TypeError: object of type 'generator' has no len()
```

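For context on where this comes from: the traceback ends in dask_expr's from_delayed, which calls len() on its input, while pySCENIC's create_graph hands it a generator of delayed objects. A minimal sketch, assuming a recent dask installation backed by the dask-expr backend, that reproduces the failure mode outside pySCENIC:

```python
import pandas as pd
from dask import delayed
from dask.dataframe import from_delayed

# A generator of Delayed objects, analogous to what pySCENIC's
# create_graph() feeds into the aggregation function.
parts = (delayed(pd.DataFrame)({'x': [i]}) for i in range(3))

# from_delayed(parts)           # TypeError: object of type 'generator' has no len()
df = from_delayed(list(parts))  # materializing the generator first avoids the error
print(df.compute())
```

This suggests a fix on pySCENIC's side would presumably be to materialize the generator before aggregation; the workarounds in the comments below instead avoid the dask code path altogether.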
Expected behavior
The call should simply run without error, since all of the passed arguments conform to the types inferred from the above-mentioned notebook.


dmalzl commented 3 months ago

Looking through the code, I found that this can be fixed by simply running the function in local mode, i.e. passing client_or_address = 'custom_multiprocessing', which prompts prune2df to use pd.concat for aggregating results instead of dask's from_delayed and thus bypasses the underlying problem (see the statement here). However, make sure you also pass the number of CPUs you want to use via num_workers, or your machine might be swamped with concurrent processes. A sketch of the workaround is shown below.
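A minimal sketch of that workaround, reusing the cistarget_db and inferred_modules objects from the snippet in the report (num_workers=8 is just an example value):

```python
from pyscenic.prune import prune2df

# Local "custom_multiprocessing" mode: prune2df aggregates the per-chunk
# results with pd.concat instead of dask's from_delayed, sidestepping the
# generator/len() problem shown in the traceback.
pruned_df = prune2df(
    [cistarget_db],
    inferred_modules,
    '../scenic_resource/motifs-v9-nr.hgnc-m0.001-o0.0.tbl',
    client_or_address='custom_multiprocessing',
    num_workers=8,  # cap the number of worker processes explicitly
)
```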

lpalvin commented 3 months ago

I encountered the same error when running pyscenic ctx on the command line. After a series of runs, I found that the problem was likely with dask, so I removed the --mode "dask_multiprocessing" parameter and it worked fine, as sketched below. I guess this is probably a dask environment configuration issue.
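For reference, a sketch of that CLI workaround (all file names here are placeholders, not the ones from this thread):

```sh
# Drop --mode "dask_multiprocessing" so the dask code path is not used;
# --mode custom_multiprocessing selects the non-dask path explicitly.
pyscenic ctx adjacencies.tsv \
    rankings.feather \
    --annotations_fname motifs.tbl \
    --expression_mtx_fname expression.loom \
    --output regulons.csv \
    --mode custom_multiprocessing \
    --num_workers 8
```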