SCENIC+ is a Python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.
Ray spill out of disk error when using run_pycistarget #78
While running the run_pycistarget wrapper on a 160-topic model containing 20k cells, I get an out-of-disk error from Ray. Even with 600 GB of memory, the run appears to need more. Is there a way to limit disk and memory usage?
So far I have run it with 8 cores, 600 GB of memory, and 200-300 GB of tmp scratch space available.
Is this expected for this run, or can the memory load be reduced?
Thanks!
For mm10, these are the databases I'm using:
https://resources.aertslab.org/cistarget/databases/mus_musculus/mm10/screen/mc_v10_clust/region_based/mm10_screen_v10_clust.regions_vs_motifs.rankings.feather
https://resources.aertslab.org/cistarget/databases/mus_musculus/mm10/screen/mc_v10_clust/region_based/mm10_screen_v10_clust.regions_vs_motifs.scores.feather
https://resources.aertslab.org/cistarget/motif2tf/motifs-v10nr_clust-nr.mgi-m0.001-o0.0.tbl
And this is the run code:
    import os

    from scenicplus.wrappers.run_pycistarget import run_pycistarget

    run_pycistarget(
        region_sets = region_sets,
        species = 'mus_musculus',
        save_path = os.path.join(work_dir, 'motifs'),
        ctx_db_path = rankings_db,
        dem_db_path = scores_db,
        path_to_motif_annotations = motif_annotation,
        run_without_promoters = True,
        n_cpu = 8,
        # Ray writes its temporary and spill files under this directory
        _temp_dir = os.path.join(tmp_dir, 'ray_spill'),
        annotation_version = 'v10nr_clust',
    )
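
To rule out a scratch volume that is simply too small, the free space on the spill directory can be checked before launching. This is only a minimal sketch using the Python standard library, not part of SCENIC+ or Ray: the helper names `free_space_gb` and `check_spill_dir` are hypothetical, the path is a stand-in for the `_temp_dir` passed in the call above, and the 300 GB threshold is an arbitrary assumption rather than a known requirement of the run.

```python
import os
import shutil
import tempfile


def free_space_gb(path):
    """Return the free space, in GB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3


def check_spill_dir(path, min_free_gb):
    """Create `path` if needed and report whether it has at least `min_free_gb` free."""
    os.makedirs(path, exist_ok=True)
    free_gb = free_space_gb(path)
    return free_gb >= min_free_gb, free_gb


if __name__ == "__main__":
    # Hypothetical stand-in for os.path.join(tmp_dir, 'ray_spill') in the call above.
    spill_dir = os.path.join(tempfile.gettempdir(), "ray_spill")
    # 300 GB is an assumed threshold for illustration only.
    ok, free_gb = check_spill_dir(spill_dir, min_free_gb=300)
    print(f"{spill_dir}: {free_gb:.1f} GB free, enough={ok}")
```

Running this immediately before the run_pycistarget call, and again after a failure, would at least show whether the scratch space was already partly used by something else.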