Closed leahyekim closed 5 months ago
Hi @leahyekim ,
I think once I had similar issue, please check in the folder region_sets/DARs_cell_type
or in region_sets/Topics_top_3k
(or whichever name your folders have there) that you do not have empty .bed
files. After I removed mine, this function worked.
I had empty bed files because one of my cell types had too little cells, and no DARs were found for it. At least, this is my explanation to it.
Best, Dmitrii
Hi @DmitriiSeverinov,
Thank you for your reply. Yes, you're right. There were some empty .bed files in that folder!
After removing those empty .bed files, the Snakemake pipeline got past that error.
Thank you so much!
Hello! Thank you so much for developing such an amazing tool :)
I would like get some clarification on what
input_data
should be for running the snakemake pipeline (Tutorial Step 4: Run SCENIC+ using Snakemake).When I first ran this,
!bat scplus_pipeline/Snakemake/config/config.yaml
bat command was not found (I don't have the error output anymore, sorry).Instead, I was able to see the output when I did
!cat scplus_pipeline/Snakemake/config/config.yaml
However, the path to my input_data were all empty.
So I ended up editing my config.yaml file by manually putting the path to each input files: The path to my config.yaml file is:
/home1/yeeunk/scenicplus/scplus_pipeline/Snakemake/config/config.yaml
Then I ran
snakemake --cores 20
I just ran this because I thought the source and activating the conda environment is not necessary. I have my scenicplus environment activated in my jupyter notebook.And I got this error (I copied and pasted the error at the bottom so that other people can search for it as well)
I think there is something wrong with my .feather file which was generated after creating the custom cistarget database. The code that I used to generate that database is:
The output of that code was exactly the same as on the documentation here : https://scenicplus.readthedocs.io/en/latest/human_cerebellum_ctx_db.html
I also took a look at what .tbl file looks like (I'm working with a mouse dataset)
And my regions_vs_motifs.rankings.feather :
And my regions_vs_motifs.scores.feather :
It seems like my input_data is fine for running rule motif_enrichment_cistarget according to your raw code here: Line 51-54 of https://github.com/aertslab/scenicplus/blob/main/src/scenicplus/snakemake/Snakefile
Could you please help me understand why I am not getting the output file of rule motif_enrichment_cistarget? I appreciate any guidance to debug this. Thank you so much!!
Version (please complete the following information):
[Pasting the error output below as well so that other people could search it] Assuming unrestricted shared filesystem usage for local execution. Building DAG of jobs... Using shell: /bin/bash Provided cores: 20 Rules claiming more threads will be scaled down. Job stats: job count
AUCell_direct 1 AUCell_extended 1 all 1 download_genome_annotations 1 eGRN_direct 1 eGRN_extended 1 get_search_space 1 motif_enrichment_cistarget 1 motif_enrichment_dem 1 prepare_GEX_ACC_multiome 1 prepare_menr 1 region_to_gene 1 scplus_mudata 1 tf_to_gene 1 total 14
Select jobs to execute... Execute 1 jobs...
[Fri May 24 10:24:14 2024] localrule motif_enrichment_cistarget: input: /home1/yeeunk/scenic+_Mar2024/outs/region_sets, /home1/yeeunk/scenic+_Mar2024/outs/region_sets/multiome_P1coc.regions_vs_motifs.rankings.feather, /home1/yeeunk/scenic+_Mar2024/outs/region_sets/motifs-v10-nr.mgi-m0.00001-o0.0.tbl output: /home1/yeeunk/scenicplus/outs/ctx_results.hdf5, /home1/yeeunk/scenicplus/outs/ctx_results.html jobid: 9 reason: Missing output files: /home1/yeeunk/scenicplus/outs/ctx_results.hdf5 threads: 20 resources: tmpdir=/tmp/SLURM_23457528
2024-05-24 10:24:24,015 SCENIC+ INFO Reading region sets from: /home1/yeeunk/scenic+_Mar2024/outs/region_sets 2024-05-24 10:24:24,015 SCENIC+ INFO Reading all .bed files in: Topics_top_3k 2024-05-24 10:24:24,166 SCENIC+ INFO Reading all .bed files in: Topics_otsu 2024-05-24 10:24:24,531 SCENIC+ INFO Reading all .bed files in: DARs_cell_type Traceback (most recent call last): File "/home1/yeeunk/.conda/envs/scenicplus/bin/scenicplus", line 8, in
sys.exit(main())
^^^^^^
File "/home1/yeeunk/scenicplus/src/scenicplus/cli/scenicplus.py", line 1137, in main
args.func(args)
File "/home1/yeeunk/scenicplus/src/scenicplus/cli/scenicplus.py", line 386, in motif_enrichment_cistarget
run_motif_enrichment_cistarget(
File "/home1/yeeunk/scenicplus/src/scenicplus/cli/commands.py", line 149, in run_motif_enrichment_cistarget
region_set_dict[key_name] = pr.read_bed(
^^^^^^^^^^^^
File "/home1/yeeunk/.conda/envs/scenicplus/lib/python3.11/site-packages/pyranges/readers.py", line 79, in read_bed
first_start = open(f).readline().split()[1]