aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

[BUG] Error in running pyscenic through docker image #395

Open thereallda opened 2 years ago

thereallda commented 2 years ago

Hi,

First, thank you for the great package and the nice tutorial.

Describe the bug

I followed the PBMC10k protocol. Everything went smoothly until the pyscenic step. When I tried to run `pyscenic grn` using the Docker image, the error occurred.

Steps to reproduce the behavior

  1. Command run when the error occurred:
docker run -it --rm \
-v $PWD:/data aertslab/pyscenic:0.10.0 pyscenic grn \
--num_workers 20 \
-o /data/adj.tsv \
-m grnboost2 \
/data/PBMC10k_filtered.loom /data/hs_hgnc_tfs.txt

The current working directory contained the .loom file and the TF list:

$ ls
arboreto_with_multiprocessing.py
dask-worker-space
filtered_feature_bc_matrix
hg38__refseq-r80__10kb_up_and_down_tss.mc9nr.feather
hs_hgnc_tfs.txt
motifs-v9-nr.hgnc-m0.001-o0.0.tbl
PBMC10k_filtered.loom
pbmc_10k_v3_filtered_feature_bc_matrix.tar.gz
SCENIC_protocol.ipynb
  2. Error encountered:
    
    2022-06-07 09:53:16,066 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.
    2022-06-07 09:53:17,727 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
    /opt/venv/lib/python3.7/site-packages/dask/config.py:161: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
      data = yaml.load(f.read()) or {}
    preparing dask client
    parsing input
    /opt/venv/lib/python3.7/site-packages/arboreto/algo.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
      expression_matrix = expression_data.as_matrix()
    creating dask graph
    20 partitions
    computing dask graph
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
    distributed.nanny - WARNING - Worker process 76 was killed by signal 15
    distributed.scheduler - ERROR - Workers don't have promised key: ['tcp://127.0.0.1:40614'], finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e
    NoneType: None
    distributed.client - WARNING - Couldn't gather 1 keys, rescheduling {'finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e': ('tcp://127.0.0.1:40614',)}
    distributed.nanny - WARNING - Restarting worker
    distributed.utils_perf - WARNING - full garbage collections took 11% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 11% CPU time recently (threshold: 10%)
    distributed.utils_perf - WARNING - full garbage collections took 13% CPU time recently (threshold: 10%)
    ...
    distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
    distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
    distributed.scheduler - ERROR - Workers don't have promised key: ['tcp://127.0.0.1:42724'], finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e
    NoneType: None
    distributed.nanny - WARNING - Worker process 64 was killed by signal 15
    distributed.client - WARNING - Couldn't gather 1 keys, rescheduling {'finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e': ('tcp://127.0.0.1:42724',)}
    distributed.nanny - WARNING - Restarting worker
    ...



I checked that memory usage was around 24G while pyscenic was running, and my machine still had about 90G of free memory.
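
One possible reading of the numbers above: the "95% memory budget" in the log is enforced per worker, not for the machine as a whole, so a single worker can be restarted while total usage looks low. The sketch below is only back-of-the-envelope arithmetic. It assumes (this is not confirmed anywhere in this thread) that each Dask worker's default limit is roughly the total memory scaled by `threads_per_worker / total_cores`. The 114 GiB total and the 40 cores are hypothetical placeholders, and `per_worker_budget` is an illustrative helper, not part of pySCENIC or Dask.

```python
GiB = 1024 ** 3


def per_worker_budget(total_memory_bytes, total_cores,
                      threads_per_worker=1, terminate_fraction=0.95):
    """Estimate one worker's memory limit and the point at which it is restarted.

    ASSUMPTION: the default per-worker limit is the machine's total memory
    scaled by threads_per_worker / total_cores. The 0.95 fraction mirrors the
    "95% memory budget" message in the log.
    """
    share = min(1.0, threads_per_worker / total_cores)
    limit = int(total_memory_bytes * share)
    restart_at = int(limit * terminate_fraction)
    return limit, restart_at


# Hypothetical machine: ~114 GiB of RAM and 40 cores (not taken from the report).
limit, restart_at = per_worker_budget(total_memory_bytes=114 * GiB, total_cores=40)
print(f"per-worker limit: {limit / GiB:.2f} GiB")
print(f"worker restarted at: {restart_at / GiB:.2f} GiB")
```

Under these assumed numbers each worker would have under 3 GiB to itself, so one worker holding a large intermediate result could hit its budget even with tens of gigabytes free overall. If that is what is happening, a smaller input or fewer workers would be the things to try. A memory cap on the container itself could have a similar effect.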

**Please complete the following information:**
- pySCENIC version: 0.10.0
- Installation method: Docker
- Run environment: CLI
- OS: CentOS 7
- Package versions: NA

Any help would be appreciated!

Thanks!
hyq9588 commented 1 year ago

Have you solved this problem? I ran into the same issue. I used Singularity containers to run pyscenic (aertslab-pyscenic-0.9.18.sif and aertslab-pyscenic-0.12.1.sif).

thereallda commented 1 year ago

> Have you solved this problem? I ran into the same issue. I used Singularity containers to run pyscenic (aertslab-pyscenic-0.9.18.sif and aertslab-pyscenic-0.12.1.sif).

Yes. I solved it by using the PBMC 3k dataset and the following command:

docker run -it --rm -v $PWD:$PWD -w $PWD aertslab/pyscenic:0.10.0 pyscenic grn --num_workers 20 -o adj_pbmc.tsv -m grnboost2 pbmc3k.loom hs_hgnc_tfs.txt