SCENIC+ is a Python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.
ValueError: Length mismatch: Expected axis has 0 elements, new values have 3 elements #490
When running the SCENIC+ pipeline, I encountered a ValueError related to column mismatches during processing. The issue appears to be caused by how the region_names_to_coordinates function is handling the input regions, particularly when reading BED files, leading to an unexpected format that disrupts downstream processing. This error occurs during the second step of the Snakemake pipeline, specifically in the motif_enrichment_cistarget step.
This package is really impressive, and I’m excited to use it for my paper. However, I’ve been struggling for months to get it running, and even our core bioinformatician was unable to resolve all the issues. I would greatly appreciate any help you can provide. Thank you!
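To make the error message easier to reason about, here is a minimal sketch of how a list of names can end up with nothing to convert into coordinates. It is not the pycistarget implementation: `split_region_names`, the `REGION_NAME` pattern, and the example names are hypothetical, and the sketch only assumes that region names are expected in `chrom:start-end` form. If every name is rejected, a table built from the parsed result would have no columns, which would be consistent with "Expected axis has 0 elements" when three column names are then assigned.

```python
import re

# Hypothetical pattern for names of the form "chr1:1000-2000".
REGION_NAME = re.compile(r"^([^:]+):(\d+)-(\d+)$")


def split_region_names(names):
    """Return (parsed, rejected): coordinate tuples and names that do not parse."""
    parsed, rejected = [], []
    for name in names:
        match = REGION_NAME.match(name)
        if match:
            chrom, start, end = match.groups()
            parsed.append((chrom, int(start), int(end)))
        else:
            rejected.append(name)
    return parsed, rejected


# Illustrative inputs only; neither list comes from the failing run.
region_like = ["chr1:1000-2000", "chr2:500-900"]
gene_like = ["GAPDH", "ACTB"]  # names without coordinates

print(split_region_names(region_like))
print(split_region_names(gene_like))
```

Running a check like this on whatever list of names reaches the coordinate-parsing step would show whether any of them are in the expected form.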
Assuming unrestricted shared filesystem usage for local execution.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 2
Rules claiming more threads will be scaled down.
Job stats:
job                            count
---------------------------  -------
AUCell_direct                      1
AUCell_extended                    1
all                                1
download_genome_annotations        1
eGRN_direct                        1
eGRN_extended                      1
get_search_space                   1
motif_enrichment_cistarget         1
motif_enrichment_dem               1
prepare_menr                       1
region_to_gene                     1
scplus_mudata                      1
tf_to_gene                         1
total                             13
Select jobs to execute...
Execute 1 jobs...
[Wed Oct 23 17:18:49 2024]
localrule motif_enrichment_cistarget:
input: /lila/data/niecr/cheongj/ibd/scenicplus/output/region_sets, /lila/data/niecr/cheongj/ibd/scenicplus/mc_v10_clust/gene_based/hg38_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather, /data/niecr/cheongj/ibd/scenicplus/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.hdf5, /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.html
jobid: 8
reason: Missing output files: /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.hdf5
threads: 2
resources: tmpdir=/scratch/lsftmp/10095877.tmpdir
2024-10-23 17:19:29,521 SCENIC+ INFO Reading region sets from: /lila/data/niecr/cheongj/ibd/scenicplus/output/region_sets
2024-10-23 17:19:29,526 SCENIC+ INFO Reading all .bed files in: test
2024-10-23 17:19:57,880 cisTarget INFO Reading cisTarget database
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/externals/loky/process_executor.py", line 463, in _process_worker
r = call_item()
^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/externals/loky/process_executor.py", line 291, in __call__
return self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 589, in __call__
return [func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 589, in <listcomp>
return [func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/commands.py", line 131, in _run_cistarget_single_region_set
ctx_db = cisTargetDatabase(
^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycistarget/motif_enrichment_cistarget.py", line 55, in __init__
self.regions_to_db, self.db_rankings, self.total_regions = self.load_db(fname,
^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycistarget/motif_enrichment_cistarget.py", line 111, in load_db
target_to_db = target_to_query(region_sets, list(db_regions), fraction_overlap = fraction_overlap)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycistarget/utils.py", line 280, in target_to_query
query_pr=pr.PyRanges(region_names_to_coordinates(query))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycistarget/utils.py", line 35, in region_names_to_coordinates
regiondf.columns=['Chromosome', 'Start', 'End']
^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/generic.py", line 5920, in __setattr__
return object.__setattr__(self, name, value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/generic.py", line 822, in _set_axis
self._mgr.set_axis(axis, labels)
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 228, in set_axis
self._validate_set_axis(axis, new_labels)
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/internals/base.py", line 70, in _validate_set_axis
raise ValueError(
ValueError: Length mismatch: Expected axis has 0 elements, new values have 3 elements
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/bin/scenicplus", line 8, in <module>
sys.exit(main())
^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/scenicplus.py", line 1137, in main
args.func(args)
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/scenicplus.py", line 388, in motif_enrichment_cistarget
run_motif_enrichment_cistarget(
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/cli/commands.py", line 242, in run_motif_enrichment_cistarget
cistarget_results: List[cisTarget] = joblib.Parallel(
^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 1952, in __call__
return output if self.return_generator else list(output)
^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 1595, in _get_outputs
yield from self._retrieve()
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 1699, in _retrieve
self._raise_error_fast()
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 1734, in _raise_error_fast
error_job.get_result(self.timeout)
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 736, in get_result
return self._return_or_raise()
^^^^^^^^^^^^^^^^^^^^^^^
File "/data/niecr/cheongj/miniconda3/envs/scenicplus/lib/python3.11/site-packages/joblib/parallel.py", line 754, in _return_or_raise
raise self._result
ValueError: Length mismatch: Expected axis has 0 elements, new values have 3 elements
[Wed Oct 23 17:20:00 2024]
Error in rule motif_enrichment_cistarget:
jobid: 8
input: /lila/data/niecr/cheongj/ibd/scenicplus/output/region_sets, /lila/data/niecr/cheongj/ibd/scenicplus/mc_v10_clust/gene_based/hg38_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather, /data/niecr/cheongj/ibd/scenicplus/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.hdf5, /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.html
shell:
scenicplus grn_inference motif_enrichment_cistarget --region_set_folder /lila/data/niecr/cheongj/ibd/scenicplus/output/region_sets --cistarget_db_fname /lila/data/niecr/cheongj/ibd/scenicplus/mc_v10_clust/gene_based/hg38_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather --output_fname_cistarget_result /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.hdf5 --temp_dir /scratch/cheongj/tmp --species homo_sapiens --fr_overlap_w_ctx_db 0.4 --auc_threshold 0.005 --nes_threshold 3.0 --rank_threshold 0.05 --path_to_motif_annotations /data/niecr/cheongj/ibd/scenicplus/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl --annotation_version v10nr_clust --motif_similarity_fdr 0.001 --orthologous_identity_threshold 0.0 --annotations_to_use Direct_annot Orthology_annot --write_html --output_fname_cistarget_html /data/niecr/cheongj/ibd/scenicplus/outs/ctx_results.html --n_cpu 2
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-23T171849.394334.snakemake.log
WorkflowError:
At least one job did not complete successfully.
The BED files in the region set directory don't seem to be the problem.
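For anyone who wants to verify the BED files independently, the following is a rough sketch of the kind of check that can rule them out. The helpers `check_bed_file` and `check_region_sets` are hypothetical and are not part of SCENIC+ or pycistarget. The sketch assumes tab-separated BED files with chromosome, start, and end in the first three columns, and it searches subfolders because the log above reports reading .bed files from a subfolder of the region set folder.

```python
from pathlib import Path


def check_bed_file(path):
    """Count usable records in a BED file and collect malformed line numbers."""
    n_ok, bad_lines = 0, []
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            # Skip blank lines and common header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split("\t")
            # A usable record has at least three fields with integer start and end.
            if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
                n_ok += 1
            else:
                bad_lines.append(lineno)
    return n_ok, bad_lines


def check_region_sets(folder):
    """Check every .bed file below `folder`, including subfolders."""
    return {
        str(path): check_bed_file(path)
        for path in sorted(Path(folder).rglob("*.bed"))
    }
```

Calling `check_region_sets` on the region set folder returns, per file, the number of usable records and the line numbers that did not parse. A file with zero usable records or many malformed lines would point back at the BED files; clean results would suggest looking elsewhere.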
versions
PyRanges version: 0.0.111 SCENIC+ version: 1.0a1 Python 3.11.8
error message
The BED files in the region set directory don't seem to be the problem.
I did..