suhrig / arriba

Fast and accurate gene fusion detection from RNA-Seq data

Read identifiers not appearing in fusions.discarded.tsv #164

Closed by 14zac2 2 years ago

14zac2 commented 2 years ago

Hello,

I would like to examine my discarded fusion candidates in addition to the ones that pass the filters, with the same information in fusions.discarded.tsv as in fusions.tsv. However, although the fusion_transcript, peptide_sequence, and read_identifiers columns are present in the header of fusions.discarded.tsv, every entry in those columns contains only a "." placeholder.

I am using Arriba version 2.3.0 with the following command to extract reads from a BAM file created by STARsolo:

arriba \
-x ../starSolo_whitelist_stringent_output/Aligned.out.bam \
-f "blacklist intronic relative_support homologs read_through short_anchor in_vitro intragenic_exonic low_coverage_viral_contigs hairpin end_to_end" \
-g /data/zoe_analysis/sc2_ortho_mito_virus_chrom/genes/genes.gtf \
-a /data/zoe_analysis/sc2_ortho_mito_virus_chrom/fasta/genome.fa \
-o fusions.tsv \
-O fusions.discarded.tsv \
-i "*" -I -T 10 -S 1 -F 98 -A 10 -U 32767 -u

I have tried specifying the -I flag twice, as recommended in the manual. This worked with a previous version of Arriba, but now it only produces an error saying that -I is specified too often.

Many thanks, Zoe

suhrig commented 2 years ago

Hi Zoe,

There were some changes to the command line options with the release of version 2.0.0. The options -I, -T, and -P were removed or repurposed. If you want read identifiers in the discarded fusions file, you need to activate the switch -X. Note that this will not only cause Arriba to write read identifiers to the discarded file, but also transcript sequences and peptide sequences. There is no way to enable just one of these functions selectively anymore.

The decision to remove these parameters was made because users tended to forget to enable them and were then frustrated when this information was missing from the output file. So since version 2, the functions of these (now obsolete) parameters are enabled by default, which makes the parameters themselves redundant.
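
One way to confirm that the -X switch took effect is to count how many rows of the discarded file still hold the "." placeholder in a given column. The following is a minimal sketch, not part of Arriba: the helper name placeholder_count is hypothetical, and it assumes the file is tab-separated with a single header line whose first field starts with "#".

```shell
# Hypothetical helper (not shipped with Arriba): count data rows whose value
# in a named column is the "." placeholder.
# Usage: placeholder_count FILE COLUMN_NAME
placeholder_count() {
    awk -F '\t' -v col="$2" '
        NR == 1 {
            # Assumption: the header is the first line and begins with "#".
            sub(/^#/, "", $1)
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        $idx == "." { n++ }          # row still carries the placeholder
        END { if (idx) print n + 0 } # print 0 rather than an empty string
    ' "$1"
}
```

With the function defined, placeholder_count fusions.discarded.tsv read_identifiers prints the number of discarded candidates that still carry the placeholder. Comparing that number before and after adding -X to the original command shows whether the extra columns are now being written.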

Regards, Sebastian

14zac2 commented 2 years ago

Hi Sebastian - thanks so much for this clarification! Not sure how I completely missed the -X flag. Everything works great now!

Many thanks, Zoe