Open vcleon88 opened 4 weeks ago
Hi @vcleon88
I had the same issue and was able to resolve it. Your problem is similar to the one described in issue #426.
The likely cause is the format of the var_names in your MuData object. You can check the format by running the following:
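import mudata
# Replace the placeholder with the path to your own ACC_GEX.h5mu file
mdata = mudata.read("<PATH_TO_ACC_GEX.h5mu>")
# Inspect the region names of the scATAC modality
mdata["scATAC"].var_names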
The format should be "chr:start-end". In my case it was "chr-start-end", so I reformatted mdata["scATAC"].var_names to "chr:start-end" and saved the result as a new MuData file, replacing the old one in my out folder. Hope that helps.
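The renaming step described above could look roughly like the sketch below. The helper names `to_colon_format` and `fix_region_names` are made up for illustration, and the regular expression assumes that start and end are plain integers at the end of each name.

```python
import re

# Matches names such as "chr1-100-200". The chromosome part is matched
# greedily so that contig names containing "-" or "." still work.
_DASH_FORMAT = re.compile(r"^(?P<chrom>.+)-(?P<start>\d+)-(?P<end>\d+)$")


def to_colon_format(name: str) -> str:
    """Convert "chr-start-end" to "chr:start-end"; leave other names unchanged."""
    match = _DASH_FORMAT.match(name)
    if match is None or ":" in name:
        return name
    return f"{match['chrom']}:{match['start']}-{match['end']}"


def fix_region_names(names):
    """Apply to_colon_format to every region name (hypothetical helper)."""
    return [to_colon_format(n) for n in names]


example = ["chr1-100-200", "chr2:300-400", "GL000194.1-50-60"]
print(fix_region_names(example))

# Possible use with the MuData object from the snippet above. The output
# path is a placeholder, and update()/write() are assumed to behave as in
# the installed mudata version:
# mdata["scATAC"].var_names = fix_region_names(mdata["scATAC"].var_names)
# mdata.update()
# mdata.write("<PATH_TO_NEW_ACC_GEX.h5mu>")
```

Names that already contain ":" are returned untouched, so running the helper twice should not change anything further.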
Hi @kennethho04
Thank you so much!!!! I solved this problem!!!
I created my own chromsizes.tsv. It looks like this:
  Chromosome  Start        End
0       chr1      0  248956422
1       chr2      0  242193529
2       chr3      0  198295559
3       chr4      0  190214555
4       chr5      0  181538259
Index(['Chromosome', 'Start', 'End'], dtype='object')
and the genome_annotation.tsv with filtered chromosomes looks like this:
  Chromosome  Start   End Strand     Gene  Transcription_Start_Site Transcript_type
0       chrM   3307  4262      +   MT-ND1                      3307  protein_coding
1       chrM   4470  5511      +   MT-ND2                      4470  protein_coding
2       chrM   5904  7445      +   MT-CO1                      5904  protein_coding
3       chrM   7586  8269      +   MT-CO2                      7586  protein_coding
4       chrM   8366  8572      +  MT-ATP8                      8366  protein_coding
Index(['Chromosome', 'Start', 'End', 'Strand', 'Gene', 'Transcription_Start_Site', 'Transcript_type'], dtype='object')
However, when I run Snakemake, the error appears again:
:~/scplus_pipeline/Snakemake$
Assuming unrestricted shared filesystem usage for local execution.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 40
Rules claiming more threads will be scaled down.
Job stats:
job                     count
AUCell_direct               1
AUCell_extended             1
all                         1
eGRN_direct                 1
eGRN_extended               1
get_search_space            1
motif_enrichment_dem        1
prepare_menr                1
region_to_gene              1
scplus_mudata               1
tf_to_gene                  1
total                      11
Select jobs to execute...
Execute 1 jobs...
[Thu Oct 24 15:57:03 2024]
localrule get_search_space:
input: /home/gu/scecis/plusout/ACC_GEX.h5mu, /home/gu/scecis/plusout/genome_annotation.tsv, /home/gu/scecis/plusout/chromsizes.tsv
output: /home/gu/scecis/plusout/search_space.tsv
jobid: 11
reason: Missing output files: /home/gu/scecis/plusout/search_space.tsv
resources: tmpdir=/tmp
2024-10-24 15:57:08,201 SCENIC+ INFO Reading data
(scenicplus) gu@s166:~/scplus_pipeline/Snakemake$ /home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/anndata/_core/anndata.py:522: FutureWarning: The dtype argument is deprecated and will be removed in late 2024.
warnings.warn(
/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/anndata/_core/anndata.py:522: FutureWarning: The dtype argument is deprecated and will be removed in late 2024.
warnings.warn(
Traceback (most recent call last):
File "/home/gu/miniconda3/envs/scenicplus/bin/scenicplus", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/scenicplus.py", line 1137, in main
args.func(args)
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/scenicplus.py", line 208, in search_space
get_search_space_command(
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/commands.py", line 661, in get_search_space_command
search_space = get_search_space(
^^^^^^^^^^^^^^^^^
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/data_wrangling/gene_search_space.py", line 294, in get_search_space
pr_regions = pr.PyRanges(region_names_to_coordinates(scplus_region))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/utils.py", line 223, in region_names_to_coordinates
regiondf.columns = ['Chromosome', 'Start', 'End']
^^^^^^^^^^^^^^^^
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/generic.py", line 5920, in __setattr__
return object.__setattr__(self, name, value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/generic.py", line 822, in _set_axis
self._mgr.set_axis(axis, labels)
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 228, in set_axis
self._validate_set_axis(axis, new_labels)
File "/home/gu/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/internals/base.py", line 70, in _validate_set_axis
raise ValueError(
ValueError: Length mismatch: Expected axis has 0 elements, new values have 3 elements
[Thu Oct 24 15:57:18 2024]
Error in rule get_search_space:
jobid: 11
input: /home/gu/scecis/plusout/ACC_GEX.h5mu, /home/gu/scecis/plusout/genome_annotation.tsv, /home/gu/scecis/plusout/chromsizes.tsv
output: /home/gu/scecis/plusout/search_space.tsv
shell:
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-24T155703.261548.snakemake.log
WorkflowError: At least one job did not complete successfully.
Does anyone have an idea what causes this issue?
Thanks in advance.
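The traceback above stops where three column labels are assigned to a data frame that has zero columns, which would be consistent with none of the region names having been parsed into coordinates. That reading is an inference from the log, not something confirmed in this thread. A quick way to test it is to count how many region names do not look like "chr:start-end". The sketch below uses a hypothetical helper `malformed_region_names` and a deliberately simple pattern.

```python
import re

# Accepts names such as "chr1:100-200": anything without ":" as the
# chromosome, then integer start and end separated by "-".
_COLON_FORMAT = re.compile(r"^[^:]+:\d+-\d+$")


def malformed_region_names(names):
    """Return the names that do not look like "chr:start-end"."""
    return [n for n in names if not _COLON_FORMAT.match(n)]


names = ["chr1:100-200", "chr1-300-400", "chrM:5-10"]
bad = malformed_region_names(names)
print(f"{len(bad)} of {len(names)} names are malformed: {bad}")
```

Running the same check on mdata["scATAC"].var_names from the .h5mu file that Snakemake actually reads (the ACC_GEX.h5mu listed as input in the log) would show whether the renamed file is the one being used.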