aertslab / pySCENIC

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

large dataset failed #524

Open wangjiawen2013 opened 8 months ago

wangjiawen2013 commented 8 months ago

Hi, I failed to run grnboost2 on a large dataset of 50,000 cells × 28,000 genes. When I subsampled the dataset to a smaller one, grnboost2 ran successfully. The failure seems to be caused by something related to dask (a screenshot was attached to the original issue).

Here is a related dask issue; I hope it helps to improve pySCENIC: https://github.com/dask/distributed/issues/8257

https://lightrun.com/answers/dask-distributed-array--2gb-hitting-msgpack-limit-

https://github.com/dask/distributed/issues/527
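
For anyone hitting the same problem, here is a minimal sketch of the workaround described above: estimate how large the dense expression matrix is, then subsample cells before calling grnboost2. The helper names `dense_size_gb` and `subsample_cells` are hypothetical, not part of pySCENIC or arboreto. The size estimate assumes a dense float32 matrix, and the roughly 2 GB message limit is taken from the titles of the linked dask discussions, not verified here.

```python
import numpy as np
import pandas as pd


def dense_size_gb(n_cells, n_genes, dtype=np.float32):
    """Approximate size of a dense cells x genes matrix in GB (1e9 bytes)."""
    return n_cells * n_genes * np.dtype(dtype).itemsize / 1e9


def subsample_cells(expr, n_cells, seed=0):
    """Return a random subset of rows (cells) from a cells x genes DataFrame."""
    rng = np.random.default_rng(seed)
    n_keep = min(n_cells, expr.shape[0])
    idx = rng.choice(expr.shape[0], size=n_keep, replace=False)
    # Sort so the kept cells stay in their original order.
    return expr.iloc[np.sort(idx)]


# Size of the dataset from this issue, assuming dense float32 storage.
full_gb = dense_size_gb(50_000, 28_000)
print(f"full matrix: {full_gb:.1f} GB")

# Toy stand-in for a real expression matrix (100 cells x 20 genes).
rng = np.random.default_rng(42)
toy = pd.DataFrame(
    rng.poisson(1.0, size=(100, 20)).astype(np.float32),
    index=[f"cell_{i}" for i in range(100)],
    columns=[f"gene_{j}" for j in range(20)],
)
small = subsample_cells(toy, n_cells=10)
print(small.shape)
```

The subsampled DataFrame can then be passed to grnboost2 in place of the full matrix. How many cells can be kept before the failure reappears is something I have not measured.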

wangjiawen2013 commented 4 months ago

Has there been any improvement on this?