aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

[BUG] pyscenic grn ERR when running on HPC with Slurm #560

Open jung233 opened 1 month ago

jung233 commented 1 month ago

Error output:

2024-07-17 17:46:40,496 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2024-07-17 17:48:21,231 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
2024-07-17 17:48:34,611 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 232.23 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,616 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 234.39 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,618 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 230.18 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,637 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 231.81 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,699 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 229.76 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,748 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 234.01 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,770 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 230.51 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,806 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 232.22 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,826 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 242.59 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,927 - distributed.worker.memory - WARNING - Worker is at 83% memory usage. Pausing worker.  Process memory: 260.54 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,928 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 260.54 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,932 - distributed.worker.memory - WARNING - Worker is at 80% memory usage. Pausing worker.  Process memory: 250.14 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,933 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 250.14 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,955 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 230.29 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,984 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 232.98 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,989 - distributed.worker.memory - WARNING - Worker is at 81% memory usage. Pausing worker.  Process memory: 253.19 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:34,990 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 253.19 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:35,046 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 232.89 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:35,186 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 235.05 MiB -- Worker memory limit: 312.50 MiB
2024-07-17 17:48:36,375 - distributed.nanny.memory - WARNING - Worker tcp://127.0.0.1:33066 (pid=94890) exceeded 95% memory budget. Restarting...
2024-07-17 17:48:36,424 - distributed.nanny - WARNING - Restarting worker
preparing dask client
parsing input
creating dask graph
not shutting down client, client was created externally
finished
Traceback (most recent call last):
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
    frames_nosplit_nbytes_bin = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/share/home/user/miniconda3/envs/pyscenic2/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 713, in main
    args.func(args)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 97, in find_adjacencies_command
    network = method(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/algo.py", line 39, in grnboost2
    return diy(expression_data=expression_data, regressor_type='GBM', regressor_kwargs=SGBM_KWARGS,
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/algo.py", line 120, in diy
    graph = create_graph(expression_matrix,
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/core.py", line 419, in create_graph
    future_tf_matrix = client.scatter(tf_matrix, broadcast=True)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/client.py", line 2668, in scatter
    return self.sync(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/utils.py", line 364, in sync
    return sync(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/utils.py", line 440, in sync
    raise error
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/utils.py", line 414, in f
    result = yield future
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/tornado/gen.py", line 766, in run
    value = future.result()
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/client.py", line 2538, in _scatter
    await self.scheduler.scatter(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/core.py", line 1398, in send_recv_from_rpc
    return await send_recv(comm=comm, op=key, **kwargs)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/core.py", line 1157, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/comm/tcp.py", line 236, in read
    convert_stream_closed_error(self, e)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) ConnectionPool.scatter local=tcp://127.0.0.1:36830 remote=tcp://127.0.0.1:34248>: Stream is closed

Environment

pip list
Package                 Version
----------------------- -----------
aiohttp                 3.9.5
aiosignal               1.3.1
arboreto                0.1.6
async-timeout           4.0.3
attrs                   23.2.0
bokeh                   3.5.0
boltons                 24.0.0
certifi                 2024.7.4
charset-normalizer      3.3.2
click                   8.1.7
cloudpickle             3.0.0
contourpy               1.2.1
ctxcore                 0.2.0
cytoolz                 0.12.3
dask                    2024.7.0
dask-expr               1.1.7
dill                    0.3.8
distributed             2024.7.0
frozendict              2.4.4
frozenlist              1.4.1
fsspec                  2024.6.1
h5py                    3.11.0
idna                    3.7
importlib_metadata      8.0.0
interlap                0.2.7
Jinja2                  3.1.4
joblib                  1.4.2
llvmlite                0.43.0
locket                  1.0.0
loompy                  3.0.7
lz4                     4.3.3
MarkupSafe              2.1.5
msgpack                 1.0.8
multidict               6.0.5
multiprocessing_on_dill 3.5.0a4
networkx                3.3
numba                   0.60.0
numexpr                 2.10.1
numpy                   1.23.5
numpy-groupies          0.11.1
packaging               24.1
pandas                  2.2.2
partd                   1.4.2
pillow                  10.4.0
pip                     24.0
psutil                  6.0.0
pyarrow                 17.0.0
pyarrow-hotfix          0.6
pynndescent             0.5.13
pyscenic                0.12.1
python-dateutil         2.9.0.post0
pytz                    2024.1
PyYAML                  6.0.1
requests                2.32.3
scikit-learn            1.5.1
scipy                   1.14.0
setuptools              70.3.0
six                     1.16.0
sortedcontainers        2.4.0
tblib                   3.0.0
threadpoolctl           3.5.0
toolz                   0.12.1
tornado                 6.4.1
tqdm                    4.66.4
tzdata                  2024.1
umap-learn              0.5.6
urllib3                 2.2.2
wheel                   0.43.0
xyzservices             2024.6.0
yarl                    1.9.4
zict                    3.0.0
zipp                    3.19.2

Slurm Script:

#!/bin/bash
#SBATCH --job-name=pyscenic_1         
#SBATCH --nodes=1                          
#SBATCH -c 16                         
#SBATCH --partition=fat

source /share/home/user/.bash_profile
mamba activate pyscenic2

pyscenic grn --num_workers 16 \
  --sparse \
  --method grnboost2 \
  --output /share/home/user/pyscenic/sce.adj.csv \
  /share/home/user/pyscenic/expression_matrix.loom \
  /share/home/user/pyscenic/hs_hgnc_tfs.txt

Node:

RAM: 2 TB (Slurm does not limit memory for individual tasks here, which means I can use all of it)
CPU: 32 cores
While the job was running there were no other tasks on this node, and the monitoring platform showed total node memory usage below 1% (see the check sketched below).
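
The Dask worker memory limit reported in the logs (312.50 MiB per worker) is far below 2 TB, so it may be worth confirming what Slurm actually granted this job rather than the node total. A minimal sketch of such a check, assuming a standard Slurm setup (field names can vary between Slurm versions):

# Run inside the job script (or via srun on the allocated node) to see what Slurm granted this job
scontrol show job "$SLURM_JOB_ID" | grep -Ei 'mem|numcpus|tres'

# After (or during) the run, sacct reports the requested memory and the peak RSS
sacct -j "$SLURM_JOB_ID" --format=JobID,ReqMem,MaxRSS,Elapsed,State

# Resource limits as seen by the job's shell (resident set size / virtual memory, in kbytes)
ulimit -m
ulimit -v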

Sample

About 60000 cells with 50000 genes

Thank you very much.

ghuls commented 1 month ago

Is it possible that you ran out of file descriptors (open files)?

# Soft limit
ulimit -aS

# Hard limit
ulimit -aH
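
To see actual file-descriptor usage while pyscenic grn is running, something along these lines on the compute node gives a live count for the process (a sketch; the pgrep pattern is only an example):

# Find the running pyscenic process (adjust the pattern as needed)
PID=$(pgrep -f "pyscenic grn" | head -n 1)

# Current number of open file descriptors for that process
ls "/proc/$PID/fd" | wc -l

# The open-files limit that actually applies to that process
grep "Max open files" "/proc/$PID/limits"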
jung233 commented 1 month ago
[user@login03 ~]$ ulimit -aS
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 512871
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[user@login03 ~]$ ulimit -aH
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 512871
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 512871
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Thanks for your reply. But it seems unlikely that I'm running out. How can I check this further?

ghuls commented 1 month ago

open files (-n) 1024 is not a lot. Dask will open a lot of files. In the past we ran into problems with only 1024 allowed open files in combination with dask and had the cluster people change the hard limit to 16384.

You can try to see if 4k open file handles is enough (your current hard limit). If you need more, you will have to ask your cluster administrators to raise the hard limit:

ulimit -S -n 4096
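
A minimal sketch for doing this inside the job script is to raise the soft limit to whatever the hard limit on the compute node allows, just before launching pyscenic:

# Raise the soft open-files limit up to the current hard limit for this job's shell
ulimit -Sn "$(ulimit -Hn)"

# Verify the new soft limit
ulimit -Sn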
jung233 commented 1 month ago

I tried ulimit -S -n 4096 and ulimit -S -n 131072. Neither seems to work.

Slurm script:

#!/bin/bash
#SBATCH --job-name=pyscenic_1  
#SBATCH --nodes=1                          
#SBATCH -c 16                          
#SBATCH --partition=node

source /share/home/user/.bash_profile
mamba activate pyscenic2
echo "Set open files: ulimit -S -n 131072"
ulimit -S -n 131072
echo "CHECK: ulimit -aS"
# Soft limit
ulimit -aS
echo "CHECK: ulimit -aH"
# Hard limit
ulimit -aH
echo "Pyscenic starting"
pyscenic grn --num_workers 16 \
  --sparse \
  --method grnboost2 \
  --output /share/home/user/pyscenic/sce.adj.csv \
  /share/home/user/pyscenic/expression_matrix.loom \
  /share/home/user/pyscenic/hs_hgnc_tfs.txt

Output:

Set open files: ulimit -S -n 131072
CHECK: ulimit -aS
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 766890
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) 5120000
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
CHECK: ulimit -aH
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 766890
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) 5120000
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 766890
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Pyscenic starting

2024-07-19 18:43:56,086 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2024-07-19 18:45:00,128 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
2024-07-19 18:45:16,009 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 247.68 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,010 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 229.58 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,013 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 236.26 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,115 - distributed.worker.memory - WARNING - Worker is at 80% memory usage. Pausing worker.  Process memory: 251.70 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,115 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 251.70 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,117 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 245.62 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,170 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 239.96 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,173 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 234.43 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,209 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 239.17 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,247 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 236.60 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,247 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 233.76 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,298 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 237.86 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,319 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 234.42 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,324 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 232.02 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,325 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 238.15 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,325 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 235.00 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:16,330 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 237.53 MiB -- Worker memory limit: 312.50 MiB
2024-07-19 18:45:17,655 - distributed.nanny.memory - WARNING - Worker tcp://127.0.0.1:33408 (pid=78441) exceeded 95% memory budget. Restarting...
2024-07-19 18:45:17,782 - distributed.nanny - WARNING - Restarting worker
preparing dask client
parsing input
creating dask graph
not shutting down client, client was created externally
finished
Traceback (most recent call last):
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
    frames_nosplit_nbytes_bin = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/share/home/user/miniconda3/envs/pyscenic2/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 713, in main
    args.func(args)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 97, in find_adjacencies_command
    network = method(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/algo.py", line 39, in grnboost2
    return diy(expression_data=expression_data, regressor_type='GBM', regressor_kwargs=SGBM_KWARGS,
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/algo.py", line 120, in diy
    graph = create_graph(expression_matrix,
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/core.py", line 419, in create_graph
    future_tf_matrix = client.scatter(tf_matrix, broadcast=True)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/client.py", line 2668, in scatter
    return self.sync(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/utils.py", line 364, in sync
    return sync(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/utils.py", line 440, in sync
    raise error
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/utils.py", line 414, in f
    result = yield future
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/tornado/gen.py", line 766, in run
    value = future.result()
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/client.py", line 2538, in _scatter
    await self.scheduler.scatter(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/core.py", line 1398, in send_recv_from_rpc
    return await send_recv(comm=comm, op=key, **kwargs)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/core.py", line 1157, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/comm/tcp.py", line 236, in read
    convert_stream_closed_error(self, e)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) ConnectionPool.scatter local=tcp://127.0.0.1:55942 remote=tcp://127.0.0.1:33143>: Stream is closed
jung233 commented 1 month ago

Something interesting happened when I tried running it with 8 cores (the runs above used 16). It threw a different error; I don't know if that's helpful.

2024-07-19 19:01:22,669 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2024-07-19 19:02:26,356 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
2024-07-19 19:02:40,816 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 481.24 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,131 - distributed.worker.memory - WARNING - Worker is at 82% memory usage. Pausing worker.  Process memory: 515.96 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,132 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 515.96 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,136 - distributed.worker.memory - WARNING - Worker is at 82% memory usage. Pausing worker.  Process memory: 516.37 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,136 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 516.37 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,151 - distributed.worker.memory - WARNING - Worker is at 68% memory usage. Resuming worker. Process memory: 425.24 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,158 - distributed.worker.memory - WARNING - Worker is at 67% memory usage. Resuming worker. Process memory: 424.90 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,481 - distributed.worker.memory - WARNING - Worker is at 84% memory usage. Pausing worker.  Process memory: 529.61 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,482 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 529.61 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,484 - distributed.worker.memory - WARNING - Worker is at 81% memory usage. Pausing worker.  Process memory: 506.55 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,484 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 506.55 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,484 - distributed.worker.memory - WARNING - Worker is at 83% memory usage. Pausing worker.  Process memory: 518.85 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,484 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 518.85 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,490 - distributed.worker.memory - WARNING - Worker is at 82% memory usage. Pausing worker.  Process memory: 515.44 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,490 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 515.44 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,494 - distributed.worker.memory - WARNING - Worker is at 82% memory usage. Pausing worker.  Process memory: 517.31 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,494 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 517.31 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,499 - distributed.worker.memory - WARNING - Worker is at 68% memory usage. Resuming worker. Process memory: 427.87 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,529 - distributed.worker.memory - WARNING - Worker is at 68% memory usage. Resuming worker. Process memory: 426.35 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,547 - distributed.worker.memory - WARNING - Worker is at 67% memory usage. Resuming worker. Process memory: 424.48 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,562 - distributed.worker.memory - WARNING - Worker is at 70% memory usage. Resuming worker. Process memory: 438.62 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,687 - distributed.worker.memory - WARNING - Worker is at 82% memory usage. Pausing worker.  Process memory: 516.21 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,699 - distributed.worker.memory - WARNING - Worker is at 82% memory usage. Pausing worker.  Process memory: 515.88 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,749 - distributed.worker.memory - WARNING - Worker is at 68% memory usage. Resuming worker. Process memory: 425.11 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,758 - distributed.worker.memory - WARNING - Worker is at 67% memory usage. Resuming worker. Process memory: 424.77 MiB -- Worker memory limit: 625.00 MiB
2024-07-19 19:02:41,762 - distributed.worker.memory - WARNING - Worker is at 37% memory usage. Resuming worker. Process memory: 233.81 MiB -- Worker memory limit: 625.00 MiB
preparing dask client
parsing input
creating dask graph
not shutting down client, client was created externally
finished
Traceback (most recent call last):
  File "/share/home/user/miniconda3/envs/pyscenic2/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 713, in main
    args.func(args)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 97, in find_adjacencies_command
    network = method(
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/algo.py", line 39, in grnboost2
    return diy(expression_data=expression_data, regressor_type='GBM', regressor_kwargs=SGBM_KWARGS,
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/algo.py", line 120, in diy
    graph = create_graph(expression_matrix,
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/arboreto/core.py", line 450, in create_graph
    all_meta_df = from_delayed(delayed_meta_dfs, meta=_META_SCHEMA)
  File "/share/home/user/miniconda3/envs/pyscenic2/lib/python3.10/site-packages/dask_expr/io/_delayed.py", line 115, in from_delayed
    raise TypeError("Must supply at least one delayed object")
TypeError: Must supply at least one delayed object
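
For what it's worth, the per-worker memory limits in the logs (312.50 MiB with 16 workers, 625.00 MiB with 8) are consistent with Dask splitting a roughly 5 GB budget across the workers, and the compute-node ulimit output in the second run above shows max memory size (kbytes, -m) 5120000. If that 5 GB comes from the cluster's default per-job memory rather than an explicit request, asking Slurm for more memory is a cheap first thing to try. A minimal sketch, with an illustrative --mem value (not a confirmed fix):

#!/bin/bash
#SBATCH --job-name=pyscenic_1
#SBATCH --nodes=1
#SBATCH -c 16
#SBATCH --partition=fat
# Explicit memory request (illustrative value; --mem=0 requests a whole node's memory on many Slurm setups)
#SBATCH --mem=200G

source /share/home/user/.bash_profile
mamba activate pyscenic2

pyscenic grn --num_workers 16 \
  --sparse \
  --method grnboost2 \
  --output /share/home/user/pyscenic/sce.adj.csv \
  /share/home/user/pyscenic/expression_matrix.loom \
  /share/home/user/pyscenic/hs_hgnc_tfs.txt

If the Dask-based path keeps failing, pySCENIC also ships an arboreto_with_multiprocessing.py script that computes the GRNBoost2 adjacencies without Dask; its arguments broadly mirror pyscenic grn (check its --help for the exact interface).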