suhrig / arriba

Fast and accurate gene fusion detection from RNA-Seq data

Identifying gene fusions in plant genomes. #237

Closed by jiaanqiang 2 months ago

jiaanqiang commented 2 months ago

Thank you for developing such a user-friendly tool.

I want to use Arriba to identify gene fusions in plants, but I don't have blacklist.tsv, known_fusions.tsv, or protein_domains.gff3 files available. I have only R1.fastq, R1.fastq, genome.fa, and genome.gtf files. What command should I use to run this task?

jiaanqiang commented 2 months ago

I have only R1.fastq, R2.fastq, genome.fa, and genome.gtf files.

suhrig commented 2 months ago

Arriba can be run without the missing files. You will have more false positives, however (especially read-through transcripts).

Just run the script run_arriba.sh with empty files in place of the ones that you don't have. Alternatively, you can run STAR manually as shown in the script and then run arriba with the following parameters:

arriba -x Aligned.out.bam -o fusions.tsv -O fusions.discarded.tsv -a genome.fa -g genome.gtf -f blacklist

Here -f blacklist disables the blacklist filter, so no blacklist file is needed. Depending on how the chromosomes are named, you will also need to adapt the value of the parameter -i (the list of contigs that Arriba searches for fusions).
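
A minimal sketch of the empty-file approach described above, for readers who want to see it spelled out. The directory name placeholders and the variable ARRIBA_CMD are hypothetical choices for illustration, and the file names are the ones used in this thread. The positional arguments of run_arriba.sh differ between Arriba releases, so the call to that script is deliberately not shown; check the usage message of the installed version and pass the placeholder paths in the matching positions. The -i value is also left out because it depends on the contig names in genome.fa.

```shell
#!/bin/sh
# Sketch only: "placeholders" is a hypothetical directory name.
set -e

mkdir -p placeholders

# Create zero-byte stand-ins for the three annotation files that are missing.
for f in blacklist.tsv known_fusions.tsv protein_domains.gff3; do
    : > "placeholders/$f"
done

# The direct arriba call quoted in the reply above, kept as a string so this
# sketch can be executed as a dry run without Arriba or a BAM file present.
ARRIBA_CMD="arriba -x Aligned.out.bam -o fusions.tsv -O fusions.discarded.tsv -a genome.fa -g genome.gtf -f blacklist"

ls -l placeholders
echo "Would run: $ARRIBA_CMD"
```

Once STAR has produced Aligned.out.bam, the echoed command can be run as-is. As noted in the reply, expect more false positives than with a curated blacklist, especially read-through transcripts.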

jiaanqiang commented 2 months ago

Thank you very much, it's working fine.