Oshlack / JAFFA

JAFFA is a multi-step pipeline that takes either raw RNA-Seq reads, or pre-assembled transcripts, then searches for gene fusions
https://github.com/Oshlack/JAFFA/wiki

Bug reporting #Execution_halted #33

Closed GeorgesBed closed 3 years ago

GeorgesBed commented 7 years ago

Date: 05/02/2017 Reported By: Georges Bedran Email: gbadran_90@live.com

Product: JAFFA Version: 1.08 Platform: CentOS Version: release 6.8 (Final)

Dear developers, I am a bioinformatician working on a proteogenomics project and am interested in fusion genes. JAFFA provides a robust and comfortable environment for this kind of analysis. While testing the latest version I encountered two bugs that may be due to version incompatibilities. I would like to report them and share my workarounds. Note: the following test dataset was used: https://sourceforge.net/projects/fusioncatcher/files/test/

First bug

script : JAFFA_stages.groovy

Is it reproducible: Yes

Description

command : R --no-save --args BT474-demo/BT474-demo.txt BT474-demo/spanning_pair_counts.temp BT474-demo/BT474-demo.reads < /env/cng/proj/projet_G2P_608/scratch/hct116-crc/softs/JAFFA/JAFFA-version-1.08/get_spanning_reads_for_direct_2.R

R version 3.2.4 (2016-03-10) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
... ... ...
for( n in 1:length(sr[,1])){
    entries=candidates$fusion_genes==sr[n,1]
    spanning_rs=sr[n,2]
    if(spanning_rs>0){
        #distribute evelyn amongst instances of this fusion in the list
        scounts=table(rep_len(1:sum(entries),spanning_rs))
        if(length(scounts)<sum(entries))
            scounts=c(scounts,rep(0,sum(entries)-length(scounts)))
        candidates$spanning_pairs[entries]<-scounts
    }
}
Error in if (spanning_rs > 0) { : argument is of length zero
Execution halted

Steps to Produce/Reproduce


bpipe run .../JAFFA_hybrid.groovy .../*.gz

Workarounds


Replace line 479 in JAFFA_stages.groovy:

    echo -e "\$gene\t\$(( \$left + \$right - \$both ))" ;

with:

    echo -e "\$gene \$(( \$left + \$right - \$both ))" ;

This line generates the file "spanning_pair_counts.temp" with mixed separators (tabs and spaces) between the two columns, which causes the file to be read incorrectly and a faulty data frame to be generated.
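One way to check a file for the mixed separators described above is to count tab-separated fields per line and look at the raw bytes. The following is a minimal sketch, not part of JAFFA: the sample file contents and the gene names in it are made up for illustration, and the temporary file only stands in for spanning_pair_counts.temp.

```shell
# Hypothetical stand-in for spanning_pair_counts.temp (contents are made up).
tmp=$(mktemp)
printf 'GENE1:GENE2\t12\n' >  "$tmp"   # well-formed line: tab between the columns
printf 'GENE3:GENE4 7\n'   >> "$tmp"   # malformed line: space instead of a tab

# Collect the numbers of all lines that do not have exactly two
# tab-separated fields.
bad_lines=$(awk -F '\t' 'NF != 2 { print NR }' "$tmp")
echo "lines without exactly two tab-separated fields: ${bad_lines:-none}"

# Dump the raw bytes so tabs (\t) and spaces can be told apart.
od -c "$tmp"

rm -f "$tmp"
```

Running the same awk command on the real spanning_pair_counts.temp would show whether any line deviates from the expected two-column, tab-delimited layout.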

Second bug

script : process_transcriptome_blat_table.R

Is it reproducible: Yes

Description

47 overs=findOverlaps(ranges,type="within",ignoreSelf=TRUE,ignoreRedundant=TRUE,select="arbitrary")

candidates=do.call("rbind",lapply(split_results,multi_gene))

   # Error: Please use 'drop.self' and/or 'drop.redundant' instead of the
   #   'ignoreSelf' and/or 'ignoreRedundant' arguments.
   # Execution halted

Steps to Produce/Reproduce


bpipe run .../JAFFA_hybrid.groovy .../testDataSet/data1/*.gz

Workarounds


Replace "ignoreSelf=TRUE,ignoreRedundant=TRUE" with "drop.self=TRUE,drop.redundant=TRUE":

overs=findOverlaps(ranges,type="within",drop.self=TRUE,drop.redundant=TRUE,select="arbitrary")
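The rename above could also be applied with sed rather than by hand. This is only a sketch under assumptions: the temporary file is a hypothetical one-line stand-in for process_transcriptome_blat_table.R, and the patched text is printed instead of being written back, so nothing is modified in place.

```shell
# Hypothetical one-line stand-in for process_transcriptome_blat_table.R.
src=$(mktemp)
printf '%s\n' 'overs=findOverlaps(ranges,type="within",ignoreSelf=TRUE,ignoreRedundant=TRUE,select="arbitrary")' > "$src"

# Rename both argument names; capture the result instead of editing in place.
patched=$(sed -e 's/ignoreSelf=/drop.self=/g' \
              -e 's/ignoreRedundant=/drop.redundant=/g' "$src")
printf '%s\n' "$patched"

rm -f "$src"
```

On a real checkout, redirecting the sed output to a new file and comparing it with the original (for example with diff) before swapping the files keeps the change easy to review.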

Other Information


R version 3.3.1
GNU bash, version 4.1.2(2)-release (x86_64-redhat-linux-gnu)

nadiadavidson commented 7 years ago

Hi Georges,

We appreciate feedback like this.

For the first bug, we've never run into this error before. The spanning_pair_counts.temp file should always be tab-delimited and should never include spaces, so there must be something else going on that is specific to your system. Are you happy to share your spanning_pair_counts.temp file so I can see what's going on? My guess would be that echo -e behaves differently on your system. If you run:

    gene="AA" ; left=20 ; right=2 ; both=10
    echo -e "$gene\t$(( $left + $right - $both ))"

Do you get:

    AA 12

If not, does running:

    echo -e "$gene\t"$(( $left + $right - $both ))

fix it?
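Because the visible output of the test looks the same whether the separator is a tab or a space, inspecting the raw bytes may make the check less ambiguous. The sketch below reuses the test values from the comment and, as an assumption rather than anything proposed in this thread, uses printf instead of echo -e, since printf handles \t in its format string consistently across POSIX shells.

```shell
gene="AA"; left=20; right=2; both=10

# Build the line with printf; the \t in the format string becomes a tab.
line=$(printf '%s\t%s\n' "$gene" "$(( left + right - both ))")

# od -c prints a tab as \t, so the separator is visible byte by byte.
printf '%s\n' "$line" | od -c

# cut splits on tabs by default, so field 2 is the count only if the
# separator really is a tab.
count=$(printf '%s\n' "$line" | cut -f 2)
echo "count=$count"
```

Piping the output of the original echo -e command through od -c in the same way would show whether that shell emits a tab, a space, or a literal backslash sequence.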

The second bug we've been aware of for a while and in the dev version of JAFFA it has actually already been corrected. I will make a new release shortly with the fixed file.

Cheers, Nadia.

GeorgesBed commented 7 years ago

Hi Nadia, I forgot to mention that I also added the sep=" " option to the read.delim call in the get_spanning_reads_for_direct_2.R script (line 17). So after replacing line 479 in JAFFA_stages.groovy: echo -e "$gene\t$(( $left + $right - $both ))" ; with: echo -e "$gene $(( $left + $right - $both ))" ; it worked for me. Unfortunately, I no longer have the spanning_pair_counts.temp file, but I can rerun the analysis if you think it would be helpful.

Cheers, Georges.

nadiadavidson commented 7 years ago

Thanks Georges,

Yes, it would be useful to rerun if you have time. Did you try the modification I proposed above (echo -e "$gene\t"$(( $left + $right - $both )))? I'm a bit reluctant to change the output format to space-separated (for backward-compatibility reasons) if there's a simpler way to fix the error.

Cheers, Nadia.