Oshlack / JAFFA

JAFFA is a multi-step pipeline that takes either raw RNA-Seq reads, or pre-assembled transcripts, then searches for gene fusions
https://github.com/Oshlack/JAFFA/wiki

Errors when running multiple files JAFFAL #80

Closed EduardoGCCM closed 2 years ago

EduardoGCCM commented 2 years ago

Hi Nadia, I have been running JAFFAL on some ONT data we have generated. Each sample has about 50 million reads and is split into FASTQ files of 4000 reads each. When running JAFFAL on several files at a time, some files run without problems while others fail with a mix of exit statuses: 1, 137 and 139.

I have re-run random files that failed with status 137 or 139, and they run without problems when I run only that file. That is not the case for files that fail with status 1. E.g.:

======================== Stage get_final_list (PAH69264_pass_c19e5662_4111) ========================
R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_COLLATE failed, using "C"
3: Setting LC_TIME failed, using "C"
4: Setting LC_MESSAGES failed, using "C"
5: Setting LC_MONETARY failed, using "C"
6: Setting LC_PAPER failed, using "C"

options(echo=FALSE)
options(echo=FALSE)
7: Setting LC_MEASUREMENT failed, using "C"
[1] "Getting the location of fusion transcripts in the genome.."
Calculating gap size in the genome...
Checking if the fusions are in frame...
Merging with read coverage data...
Reassigning Low Confidence breakpoints
Error in split.default(1:dim(cand)[1], cand$fusion_genes) :
  group length is 0 but data length > 0
Calls: split -> split.default
Execution halted
ERROR: Command failed with exit status = 1 :

if [ ! -s PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq_genome.psl ] ; then touch PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq.summary ; else /home/ssd/egomez/miniconda3/envs/jaffa/bin/R --vanilla --args PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq_genome.psl PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq.reads /home/ssd/egomez/archive_ontTAS/JAFFA-version-2.2/hg38_genCode22.tab /home/ssd/egomez/archive_ontTAS/JAFFA-version-2.2/known_fusions.txt 10000 NoSupport,PotentialRunThrough 50 PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq.summary < /home/ssd/egomez/archive_ontTAS/JAFFA-version-2.2/make_final_table.R ; fi;

========================================= Pipeline Failed ==========================================

One or more parallel stages aborted. The following messages were reported:

Branch PAH69264_pass_c19e5662_4111.fastq in stage Unknown reported message:

Command failed with exit status = 1 :

if [ ! -s PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq_genome.psl ] ; then touch PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq.summary ; else /home/ssd/egomez/miniconda3/envs/jaffa/bin/R --vanilla --args PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq_genome.psl PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq.reads /home/ssd/egomez/archive_ontTAS/JAFFA-version-2.2/hg38_genCode22.tab /home/ssd/egomez/archive_ontTAS/JAFFA-version-2.2/known_fusions.txt 10000 NoSupport,PotentialRunThrough 50 PAH69264_pass_c19e5662_4111.fastq/PAH69264_pass_c19e5662_4111.fastq.summary < /home/ssd/egomez/archive_ontTAS/JAFFA-version-2.2/make_final_table.R ; fi;

Use 'bpipe errors' to see output from failed commands.

What do you think might be causing the errors?

Many thanks! Eduardo
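A note on the exit statuses mentioned above: by shell convention, a status greater than 128 usually means the process was terminated by a signal, with the signal number equal to the status minus 128. That makes 137 a SIGKILL (signal 9) and 139 a SIGSEGV (signal 11), whereas 1 is an ordinary error returned by the command itself. Whether the kills here come from memory or other resource limits when many files are processed at once is an assumption, not something the thread confirms, though it would be consistent with those files succeeding when run alone. A minimal sketch of the decoding (the function name `describe_exit_status` is hypothetical):

```shell
# describe_exit_status STATUS
# Hypothetical helper: translate a shell exit status into a short description.
describe_exit_status() {
    local status="$1"
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N means "terminated by signal N".
        local sig=$(( status - 128 ))
        echo "terminated by signal $sig ($(kill -l "$sig"))"
    else
        echo "command exited with error code $status"
    fi
}
```

For example, `describe_exit_status 137` reports signal 9, and `describe_exit_status 139` reports signal 11.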

EduardoGCCM commented 2 years ago

The sample I have is a mixture of human and murine cells (we are trying to investigate the limit of detection for rare cell populations). However, I am using only the human genome for the alignment in JAFFAL (I had problems setting it up with the combined genomes). Could the error be related to a lack of high-quality alignments for that particular file (e.g. all the reads belong to murine cells)?

nadiadavidson commented 2 years ago

Hi Eduardo, Yes, I think these errors are a result of no fusions being found. The pipeline should really exit nicely in these cases, so I will check what's going on. If you can place all the reads in one (or several) large .fastq files that would likely fix the issue. Cheers, Nadia.
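The suggestion above (placing all reads in one or several large .fastq files) can be sketched as follows. This is only an illustration under stated assumptions: the function names `merge_fastq_chunks` and `count_reads` are hypothetical, the chunks are assumed to be uncompressed `*.fastq` files in a single directory, and each read is assumed to occupy exactly four lines.

```shell
# merge_fastq_chunks CHUNK_DIR MERGED_FILE
# Hypothetical helper: concatenate every *.fastq file in CHUNK_DIR into MERGED_FILE.
# Write MERGED_FILE outside CHUNK_DIR so it is not picked up as an input.
merge_fastq_chunks() {
    local chunk_dir="$1" merged="$2"
    # find + xargs avoids "argument list too long" with thousands of chunk files;
    # sort -z keeps the concatenation order reproducible.
    find "$chunk_dir" -maxdepth 1 -type f -name '*.fastq' -print0 \
        | sort -z \
        | xargs -0 cat > "$merged"
}

# count_reads FASTQ_FILE
# Hypothetical helper: number of reads, assuming four lines per FASTQ record.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}
```

After merging, comparing `count_reads` on the merged file with the expected total is a quick sanity check before running JAFFAL on the merged file in the same way as on the individual chunks.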

EduardoGCCM commented 2 years ago

Hi Nadia, Thank you for your response. I have just done that and you are right, it seems to fix the issue.

Eduardo