suhrig / arriba

Fast and accurate gene fusion detection from RNA-Seq data

How can I generate a FASTQ file of the chimeric reads for fusion.tsv? #117

Closed biosciences closed 3 years ago

biosciences commented 3 years ago

Is there a way to generate a FASTQ file of the chimeric reads listed in fusion.tsv? I need to check the quality of the fusion-supporting reads and validate the fusions in lab experiments.

suhrig commented 3 years ago

Arriba lists the fusion-supporting reads in the column read_identifiers. There is a script in the develop branch of the repository to extract these reads from the BAM file:

https://github.com/suhrig/arriba/blob/develop/scripts/extract_fusion-supporting_alignments.sh

Run the script without arguments to learn about the usage.
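
For a quick look at which reads support a given fusion, the read_identifiers column can also be read directly from the output table. The following is only a minimal sketch and not the script linked above: the function name fusion_read_names is hypothetical, and it assumes a tab-separated file whose header row names the columns gene1 (possibly written as #gene1), gene2 and read_identifiers, with the read names stored as a comma-separated list.

```shell
# Hypothetical helper (not part of Arriba): print the supporting read names
# for one gene pair, one name per line, without duplicates.
# Assumes a tab-separated table with header columns "gene1" (or "#gene1"),
# "gene2" and "read_identifiers" (comma-separated read names).
fusion_read_names() {
    tsv=$1 gene1=$2 gene2=$3
    awk -F '\t' -v g1="$gene1" -v g2="$gene2" '
        NR == 1 {
            # Locate the needed columns by header name, not by position.
            for (i = 1; i <= NF; i++) {
                name = $i
                sub(/^#/, "", name)
                col[name] = i
            }
            next
        }
        $col["gene1"] == g1 && $col["gene2"] == g2 {
            n = split($col["read_identifiers"], ids, ",")
            for (j = 1; j <= n; j++)
                if (ids[j] != "" && ids[j] != ".") print ids[j]
        }
    ' "$tsv" | sort -u
}
```

A call such as fusion_read_names fusion.tsv BCR ABL1 > names.txt (the gene names are placeholders) writes the list to a file. If the names match the read names in the BAM file, a recent samtools should be able to filter on such a list with samtools view -N names.txt, but the linked script remains the supported route.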

Once you have the extracted reads in BAM format, you can convert them to FASTQ using samtools fastq. For example, for paired-end data you can use the following command:

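# Group mates together, then write mate 1 and mate 2 to separate files.
# Reads not flagged as mate 1 or mate 2 (-0) and singletons (-s) are discarded.
samtools collate -f -O -u -r 1000000 "$BAM_FILE" |
samtools fastq -0 /dev/null -1 read_1.fastq -2 read_2.fastq -s /dev/null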
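
As a sanity check after the conversion, it can help to confirm that both output files contain the same number of reads. This is a small sketch under the assumption that the files are uncompressed and use the usual four lines per record; the function name fastq_records is made up for illustration.

```shell
# Hypothetical helper: count the records in an uncompressed FASTQ file,
# assuming exactly four lines per record.
fastq_records() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Compare the mate files written by the samtools command above.
if [ -f read_1.fastq ] && [ -f read_2.fastq ]; then
    if [ "$(fastq_records read_1.fastq)" -eq "$(fastq_records read_2.fastq)" ]; then
        echo "read_1.fastq and read_2.fastq contain the same number of reads"
    else
        echo "read counts differ between read_1.fastq and read_2.fastq" >&2
    fi
fi
```

Differing counts would suggest that some mates were missing from the extracted BAM file and ended up in the discarded singleton output.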