gagneurlab / drop

Pipeline to find aberrant events in RNA-Seq data, useful for diagnosis of rare disorders
MIT License

Pipeline FAILS when specifying subsets of genes to test #549

Open GaberBergant opened 1 month ago

GaberBergant commented 1 month ago

Hi,

Thank you for a great piece of software. I am trying to use DROP on my cohort of patients, and I had no issues getting the pipeline to complete. However, when I try to specify a subset of genes to test, the pipeline fails at this step: AberrantSplicing_pipeline_Counting_01_2_countRNA_splitReads_merge_R

I am pasting the error below and attaching the config, sample annotation and gene set files. Any help would be very much appreciated!

sample_annotation.txt config.txt myopathies3panels.txt

NB: Patient IDs have been masked and phenotypes (HPO terms) removed for privacy reasons. The file extensions have been changed from myopathies3panel.yaml, config.yaml and sample_annotation.tsv to *.txt, since I cannot attach .tsv and .yaml files here.

[Fri May 24 13:54:00 2024]
rule AberrantSplicing_pipeline_Counting_01_2_countRNA_splitReads_merge_R:
    input: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P033.done, /home/gbergantatasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P034.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-Aome/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P036.done, /home/gbergant/DROP/outputs/raw-local-AS/sample_tmp/splitCounts/sample_P028.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P032.done, /home/gbergant/DROP/outputs/processed_da/sample_tmp/splitCounts/sample_P038.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sprocessed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P039.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splitCounts/sample_P035.done, Scripts/AberrantSplicing/pipeline/Counting/01_2_countRNA_splitReads_merge.R
    output: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/savedObjects/raw-local-AS/rawCountsJ.h5, /home/gbergant/DROP/outputs/processal-AS/gRanges_splitCounts.rds, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/gRanges_NonSplitCounts.rds, /home/gberng/datasets/cache/raw-local-AS/spliceSites_splitCounts.rds
    log: /home/gbergant/DROP/drop_wd/.drop/tmp/AS/AS/01_2_splitReadsMerge.Rds
    jobid: 45
    reason: Missing output files: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/gRanges_NonSplitCounts.rds, /home/gicing/datasets/cache/raw-local-AS/spliceSites_splitCounts.rds, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/gRanger job: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P034.done, /home/gbergant/DROets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P036.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sagbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P032.done, /home/gbergant/DROP/outputs/pro-local-AS/sample_tmp/splitCounts/sample_P039.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/spli/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P033.done, /home/gbergant/DROP/outputs/processed_data/aple_tmp/splitCounts/sample_P029.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/samplessed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P037.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicCounts/sample_P035.done
    wildcards: dataset=AS
    threads: 12
    resources: tmpdir=/tmp

Rscript --vanilla /home/gbergant/DROP/drop_wd/.snakemake/scripts/tmplenf2bhp.01_2_countRNA_splitReads_merge.R
Load packages
Fri May 24 13:54:17 2024: Start counting the split reads ...
Fri May 24 13:54:17 2024: Count split reads for sample: P029
Fri May 24 13:54:18 2024: Count split reads for sample: P037
Fri May 24 13:54:17 2024: Count split reads for sample: P033
Fri May 24 13:54:18 2024: Count split reads for sample: P039
Fri May 24 13:54:17 2024: Count split reads for sample: P030
Fri May 24 13:54:18 2024: Count split reads for sample: P038
Fri May 24 13:54:18 2024: Count split reads for sample: P036
Fri May 24 13:54:17 2024: Count split reads for sample: P034
Fri May 24 13:54:17 2024: Count split reads for sample: P028
Fri May 24 13:54:17 2024: Count split reads for sample: P032
Fri May 24 13:54:18 2024: Count split reads for sample: P035

Fri May 24 20:06:28 2024 : count ranges need to be merged ...
Fri May 24 20:06:30 2024: Create splice site indices ...
Fri May 24 20:06:30 2024: Writing split counts to folder: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/savedObjects/raw-local-AS/spli
Fri May 24 20:06:30 2024: Create splice site indices ...
Error in rowAutoGrid(x, nrow = block_nrow) :
  'nrow' must be a single integer or NULL
Calls: rowMaxs ... hstrip_apply -> best_grid_for_hstrip_apply -> rowAutoGrid
In addition: Warning message:
In best_grid_for_hstrip_apply(x, grid) :
  NAs introduced by coercion to integer range
Execution halted
[Fri May 24 20:06:31 2024]
Error in rule AberrantSplicing_pipeline_Counting_01_2_countRNA_splitReads_merge_R:
    jobid: 45
    input: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P033.done, /home/gbergantatasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P034.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-Aome/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P036.done, /home/gbergant/DROP/outputs/raw-local-AS/sample_tmp/splitCounts/sample_P028.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P032.done, /home/gbergant/DROP/outputs/processed_da/sample_tmp/splitCounts/sample_P038.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sprocessed_data/aberrant_splicing/datasets/cache/raw-local-AS/sample_tmp/splitCounts/sample_P039.done, /home/gbergant/DROP/outputs/processed_data/aberrant_splitCounts/sample_P035.done, Scripts/AberrantSplicing/pipeline/Counting/01_2_countRNA_splitReads_merge.R
    output: /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/savedObjects/raw-local-AS/rawCountsJ.h5, /home/gbergant/DROP/outputs/processal-AS/gRanges_splitCounts.rds, /home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/cache/raw-local-AS/gRanges_NonSplitCounts.rds, /home/gberng/datasets/cache/raw-local-AS/spliceSites_splitCounts.rds
    log: /home/gbergant/DROP/drop_wd/.drop/tmp/AS/AS/01_2_splitReadsMerge.Rds (check log file(s) for error details)

RuleException:
CalledProcessError in file /tmp/tmp5kipsl2z, line 139:
Command 'set -euo pipefail; Rscript --vanilla /home/gbergant/DROP/drop_wd/.snakemake/scripts/tmplenf2bhp.01_2_countRNA_splitReads_merge.R' returned non-zero
  File "/tmp/tmp5kipsl2z", line 139, in __rule_AberrantSplicing_pipeline_Counting_01_2_countRNA_splitReads_merge_R
  File "/home/gbergant/.local/bin/miniconda/envs/drop_env_133/lib/python3.8/concurrent/futures/thread.py", line 57, in run
Removing output files of failed job AberrantSplicing_pipeline_Counting_01_2_countRNA_splitReads_merge_R since they might be corrupted:
/home/gbergant/DROP/outputs/processed_data/aberrant_splicing/datasets/savedObjects/raw-local-AS/rawCountsJ.h5, /home/gbergant/DROP/outputs/processed_data/abers_splitCounts.rds
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-05-24T082708.015001.snakemake.log

ischeller commented 1 month ago

Hi @GaberBergant, unfortunately I'm not sure why this is happening for you; we have not encountered this error before. As far as I can see, it should not be related to the gene subset that you provided, since the subset is not used at all at the point where the error occurs. One thing you could try is deleting the cached counting data of the samples from the previous run of the splicing module and then rerunning the splicing module completely from the start. The folder to delete would be: {DROP_root}/processed_data/aberrant_splicing/datasets/cache/
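For anyone who would rather not delete the cache outright, here is a minimal Python sketch of the same reset that moves the folder aside instead, so it can be restored if needed. It is only an illustration, not part of DROP: the function name `set_aside_splicing_cache` and the `cache.bak-<timestamp>` naming are hypothetical, and `drop_root` is assumed to be the directory that contains `processed_data` (in the log above that appears to be /home/gbergant/DROP/outputs).

```python
import shutil
import time
from pathlib import Path


def set_aside_splicing_cache(drop_root):
    """Move the aberrant-splicing counting cache out of the way.

    drop_root is assumed to be the directory containing `processed_data`.
    Returns the backup location, or None if there was no cache folder.
    """
    cache = (Path(drop_root) / "processed_data" / "aberrant_splicing"
             / "datasets" / "cache")
    if not cache.is_dir():
        return None  # nothing to reset

    # Hypothetical naming scheme: a timestamped sibling of the cache folder.
    backup = cache.with_name("cache.bak-" + time.strftime("%Y%m%dT%H%M%S"))
    if backup.exists():
        raise FileExistsError(f"backup folder already exists: {backup}")

    shutil.move(str(cache), str(backup))
    return backup


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        # Pass the DROP output root as the only command-line argument.
        print(set_aside_splicing_cache(sys.argv[1]))
```

After the cache has been moved, the splicing module would be rerun from the DROP working directory in the usual way (assuming the standard setup, something like `snakemake aberrantSplicing --cores 12`). Once the rerun succeeds, the backup folder can be removed.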

Let us know if this helped or if the problem persists.