aertslab / scenicplus

SCENIC+ is a python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

Problem when running calculate_TFs_to_genes_relationships #335

Closed aaaaaaaaaayong closed 3 months ago

aaaaaaaaaayong commented 3 months ago

Context

From earlier issue reports I understand that this problem is probably caused by insufficient memory, but something about it is still strange.

I prepared the input files (adata, cistopic_obj, menr) from a dataset of 140,000 cells, but when running create_SCENICPLUS_object I ran out of memory, so I downsampled the cells in adata (scRNA) from 140,000 to 3,500. I then filtered the data with filter_genes(scplus_obj, min_pct = 5) and filter_regions(scplus_obj, min_pct = 25).

In the end I created a SCENIC+ object with n_cells x n_genes = 3415 x 9668 and n_cells x n_regions = 3415 x 230226. This is similar in size to the data in the tutorial "10x multiome pbmc", which I completed successfully before, so I don't know why I run out of memory with my own data.
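
As a rough sanity check on the shapes reported above, the sketch below estimates how much memory each matrix would need if it were held as a dense array. Dense storage and the float32/float64 widths are assumptions made for illustration, since SCENIC+ may store these matrices differently (for example sparsely). `dense_bytes` and `to_gib` are hypothetical helper names, not part of any library.

```python
def dense_bytes(n_rows: int, n_cols: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold an n_rows x n_cols matrix densely (assumed layout)."""
    return n_rows * n_cols * bytes_per_value


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


# Shapes reported in this issue.
n_cells, n_genes, n_regions = 3415, 9668, 230226

for label, n_cols in (("genes", n_genes), ("regions", n_regions)):
    for dtype_name, width in (("float32", 4), ("float64", 8)):
        size = to_gib(dense_bytes(n_cells, n_cols, width))
        print(f"{label:8s} {dtype_name}: {size:.2f} GiB")
```

An estimate like this only bounds a single copy of each matrix. Intermediate copies and per-worker serialization can multiply the real footprint.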

Code

calculate_TFs_to_genes_relationships(scplus_obj,
                    tf_file = tf_file,
                    ray_n_cpu = 1,
                    method = 'GBM',
                    _temp_dir = '/data/R03/chenxy957/ray_spill',
                    key= 'TF2G_adj')

Error output

... (raylet) Spilled 10428 MiB, 98 objects, write throughput 452 MiB/s. ....

After this message it runs extremely slowly and uses a lot of memory.

Version (please complete the following information)

Python: 3.8.18 SCENIC+

SeppeDeWinter commented 3 months ago

Hi @aaaaaaaaaayong

I would strongly recommend using the development version of the code; it will soon become the default version. We made quite a lot of updates that reduce the memory requirements. Tutorials on how to use it are already online, see https://scenicplus.readthedocs.io/en/development/tutorials.html.

All the best,

Seppe

aaaaaaaaaayong commented 3 months ago

Hi @SeppeDeWinter

I am now using the development version of the code and I think it is much better than before. But when running Snakemake I encounter a problem: NameError: name 'Tuple' is not defined. Did you mean: 'tuple'? I can't find any relevant information about this error online, so I wanted to ask whether you have any suggestions. The only difference between my setup and the tutorial is that my version of annoy is 1.17.2 instead of 1.17.3.

[Tue Mar 26 21:33:55 2024]
Finished job 7.
1 of 12 steps (8%) done
Select jobs to execute...
Execute 1 jobs...

[Tue Mar 26 21:33:55 2024]
localrule prepare_GEX_ACC_multiome:
    input: /data/R03/chenxy957/data/pancan/SCENICplus_new/outs/cistopic_obj.pkl, /data/R03/chenxy957/data/pancan/SCENICplus_new/outs/adata.h5ad
    output: /data/R03/chenxy957/data/pancan/SCENICplus_new/outs/outs/ACC_GEX.h5mu
    jobid: 2
    reason: Missing output files: /data/R03/chenxy957/data/pancan/SCENICplus_new/outs/outs/ACC_GEX.h5mu
    resources: tmpdir=/tmp

2024-03-26 21:34:02,974 SCENIC+      INFO     Reading cisTopic object.
2024-03-26 21:34:05,405 SCENIC+      INFO     Reading gene expression AnnData.
2024-03-26 21:34:09,046 Ingesting multiome data INFO     Found 16724 multiome cells.
2024-03-26 21:34:09,443 cisTopic     INFO     Imputing region accessibility
Traceback (most recent call last):
  File "/data/R03/chenxy957/miniconda3/envs/scenicplus_new/bin/scenicplus", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/R03/chenxy957/miniconda3/envs/scenicplus_new/lib/python3.11/site-packages/scenicplus/cli/scenicplus.py", line 1137, in main
    args.func(args)
  File "/data/R03/chenxy957/miniconda3/envs/scenicplus_new/lib/python3.11/site-packages/scenicplus/cli/scenicplus.py", line 44, in command_prepare_GEX_ACC
    prepare_GEX_ACC(
  File "/data/R03/chenxy957/miniconda3/envs/scenicplus_new/lib/python3.11/site-packages/scenicplus/cli/commands.py", line 61, in prepare_GEX_ACC
    mdata = process_multiome_data(
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/data/R03/chenxy957/miniconda3/envs/scenicplus_new/lib/python3.11/site-packages/scenicplus/data_wrangling/adata_cistopic_wrangling.py", line 44, in process_multiome_data
    imputed_acc_obj = impute_accessibility(
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/data/R03/chenxy957/miniconda3/envs/scenicplus_new/lib/python3.11/site-packages/pycisTopic/diff_features.py", line 387, in impute_accessibility
    ) -> Tuple[np.ndarray, list]:
         ^^^^^
NameError: name 'Tuple' is not defined. Did you mean: 'tuple'?

Thank you very much for your generous help and wish you all the best,

Ayong

SeppeDeWinter commented 3 months ago

Hi @aaaaaaaaaayong

Thank you for the nice comments.

I fixed this issue yesterday (https://github.com/aertslab/pycisTopic/commit/5416709112f7324152591e66474bbc67ac553306). Can you try reinstalling SCENIC+?

All the best,

Seppe
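
For readers who hit the same traceback: it is consistent with a return annotation that names `Tuple` without importing it from `typing`. The sketch below reproduces that pattern in isolation. It is not the pycisTopic source and does not claim to show what the linked commit changed; the function names are hypothetical. On interpreters that evaluate annotations eagerly (including the Python 3.11 shown in the traceback), the annotation of a nested function is evaluated when the enclosing function runs, which is why the error appears at call time rather than at import time.

```python
message = None


def outer_without_import():
    # The annotation on `inner` is evaluated when this `def` statement executes,
    # i.e. each time outer_without_import() is called (eager-annotation Pythons).
    def inner() -> Tuple[int, list]:  # `Tuple` has not been imported yet
        return 1, []

    return inner()


try:
    outer_without_import()
except NameError as err:
    message = str(err)
    print("NameError:", message)

# One possible fix: import the name before the annotation is evaluated.
# (On Python 3.9+ the builtin `tuple[int, list]` would also work.)
from typing import Tuple


def outer_with_import():
    def inner() -> Tuple[int, list]:
        return 1, []

    return inner()


result = outer_with_import()
print(result)
```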

aaaaaaaaaayong commented 3 months ago

@SeppeDeWinter

Wow, you are so reliable.

I will try reinstalling SCENIC+.

Thanks for replying so quickly,

All the best,

Ayong

aaaaaaaaaayong commented 3 months ago

@SeppeDeWinter

I successfully finished running the upstream process of the development version of the code.

Next I will do downstream analysis.

Thank you very much for your help, and I wish you all the best.

Ayong

SeppeDeWinter commented 3 months ago

That's great!

Good luck with the analyses.

All the best, Seppe