pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks, and cell types from single-cell RNA-seq data.
Describe the bug
I am running pySCENIC in a distributed cloud environment, not in an HPC environment. The issue occurs when running ctx in "dask_cluster" mode, which is the only mode suited to my underlying infrastructure (custom_multiprocessing and even dask_multiprocessing would not let me use my resources efficiently).
What is the issue with this? The "dask_cluster" mode requires me to pass my Dask cluster's IP address through an extra CLI argument, client_or_address. However, args.mode rather than args.client_or_address is passed as the client_or_address keyword argument of prune2df(), so the string "dask_cluster" is rejected because it is not a valid IP address.
The solution would be to replace args.mode with args.client_or_address in this particular line (src/pyscenic/cli/pyscenic.py#L243).
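To make the mix-up concrete, here is a minimal, self-contained sketch. It does not use pySCENIC itself: `prune2df_stub`, `is_valid`, `call_buggy` and `call_fixed` are hypothetical stand-ins, the address pattern is an assumption rather than the library's real validation, and `tcp://10.0.0.5:8786` is a made-up scheduler address. The fixed variant forwards the address only in "dask_cluster" mode, which is one possible reading of the proposed change.

```python
import argparse
import re

# Hypothetical stand-ins: the real checks inside pyscenic.prune may differ.
_NON_ADDRESS_VALUES = {"custom_multiprocessing", "dask_multiprocessing", "local"}
_ADDRESS_PATTERN = re.compile(r"^(tcp://)?[\w.\-]+:\d+$")


def is_valid(client_or_address: str) -> bool:
    """Accept a known non-address value or something shaped like host:port."""
    return (
        client_or_address in _NON_ADDRESS_VALUES
        or bool(_ADDRESS_PATTERN.match(client_or_address))
    )


def prune2df_stub(client_or_address: str) -> str:
    """Stand-in for prune2df(): reproduces only the argument check."""
    assert is_valid(client_or_address), (
        '"{}" is not valid for parameter client_or_address.'.format(client_or_address)
    )
    return client_or_address


def build_parser() -> argparse.ArgumentParser:
    """Tiny parser with the two options discussed in this report."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", default="custom_multiprocessing")
    parser.add_argument("--client_or_address", default="local")
    return parser


def call_buggy(args: argparse.Namespace) -> str:
    # Mirrors the reported behavior: the mode string is forwarded as the address.
    return prune2df_stub(client_or_address=args.mode)


def call_fixed(args: argparse.Namespace) -> str:
    # Assumed fix: forward the scheduler address when running in dask_cluster mode,
    # and keep forwarding the mode string for the other modes.
    target = args.client_or_address if args.mode == "dask_cluster" else args.mode
    return prune2df_stub(client_or_address=target)
```

With `--mode dask_cluster --client_or_address tcp://10.0.0.5:8786`, `call_buggy` trips the assertion because it hands over the string "dask_cluster", while `call_fixed` passes the scheduler address through.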
2023-03-15 10:51:11,205 - pyscenic.cli.pyscenic - INFO - Calculating regulons.
Traceback (most recent call last):
  File "/opt/conda/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 713, in main
    args.func(args)
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 236, in prune_targets_command
    df_motifs = calc_func(
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/prune.py", line 424, in prune2df
    return _distributed_calc(
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/prune.py", line 205, in _distributed_calc
    assert is_valid(
AssertionError: "dask_cluster"is not valid for parameter client_or_address.
Expected behavior
I expect {dask_scheduler_IP} to be provided to prune2df() as the corresponding client_or_address argument, not the "dask_cluster" string.
The source of the bug can be identified in this line of code: https://github.com/aertslab/pySCENIC/blob/master/src/pyscenic/cli/pyscenic.py#L243, where args.mode is passed as client_or_address to prune2df().