Closed wiltbb closed 2 years ago
The officially supported organisms are human and mouse. That being said, I have heard about users applying the tool to other organisms. As a bare minimum, you need an assembly of the genome (FastA file) and a gene model (GTF file). All else is optional and can simply be omitted when running Arriba (such as the known fusions file, the protein domains file, etc.).
As there is no blacklist for non-supported organisms, you should expect to see quite a few more false positives than usual. Moreover, you need to disable the blacklist when running Arriba using the parameter -f blacklist, or else Arriba will refuse to run.
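To make the minimum described above concrete, here is a rough sketch of a call with only the required inputs and the blacklist filter disabled. The file names are placeholders and not taken from this thread. The sketch only builds and prints the command so it can be inspected first.

```shell
# Hypothetical file names; substitute your own alignment, assembly and annotation.
bam="Aligned.out.bam"
assembly="genome.fa"
annotation="genes.gtf"

# Only the required inputs are given. -f blacklist disables the blacklist
# filter, since no blacklist exists for unsupported organisms.
cmd="arriba -x $bam -a $assembly -g $annotation -o fusions.tsv -f blacklist"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```

Optional inputs such as the known fusions file or the protein domains file are simply left out here.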
Thank you, Suhrig. I have now tried to run Arriba, but I have a new problem:
ERROR: no normal reads found
My script looks like this:
arriba \
    -x star.out \
    -a /public/home/zhukun/wheat_transcription/iwgsc_refseqv2.1_assembly.fa \
    -g /public/home/zhukun/wheat_transcription/iwgsc_refseqv2.1_assembly.gtf \
    -o fusions_demo.tsv \
    -f blacklist \
    -p /public/home/zhukun/wheat_transcription/iwgsc_refseqv2.1_annotation_200916_HC.gff3
star.out is the output of STAR. Can you help me? Thank you very much.
Does wheat need the -i parameter?
Yes, this is likely the underlying problem. If wheat chromosomes are not named as in human/mouse like chr1, chr2, etc., then you must list the chromosome names explicitly, for example: -i 'wheatchrom1 wheatchrom2 wheatchrom3'
Otherwise Arriba only looks for reads mapped to human/mouse chromosomes and does not find any, hence the error message "no normal reads found".
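One way to get the list for -i without typing it by hand is to read the contig names from the assembly's FASTA headers. This is only a sketch: the helper name list_contigs and the tiny demo file are made up for illustration, and it assumes the contig name is the header text up to the first whitespace.

```shell
# Extract contig names from a FASTA file: take each header line,
# drop the leading '>', and keep only the text before the first whitespace.
list_contigs() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | paste -sd ' ' -
}

# Tiny made-up FASTA so the sketch can be tried without real data.
printf '>Chr1A description\nACGT\n>Chr1B\nGGCC\n' > demo_assembly.fa
contigs=$(list_contigs demo_assembly.fa)
echo "$contigs"
```

The result could then be passed as -i "$contigs". Whether unplaced scaffolds should be included in the list is a separate decision that depends on the analysis.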
BTW, you can remove this parameter: -p /public/home/zhukun/wheat_transcription/iwgsc_refseqv2.1_annotation_200916_HC.gff3
This file should contain protein domains. I am pretty sure the file you have doesn't, because for human/mouse I have generated the file personally, and I have not generated such a file for the wheat genome.
What is the format of STAR.out? Note that it is necessary to change the default --chimOutType of STAR. Please read Input Files.
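As a rough illustration of the point about --chimOutType, the sketch below builds a STAR command that enables chimeric alignment and writes the chimeric reads into the main BAM file. The index directory and read file names are placeholders, and this is not the full recommended parameter set; the Input Files page lists further chimeric options that are omitted here.

```shell
# Hypothetical index directory and read files; substitute your own.
genome_dir="star_index"
reads="sample_R1.fastq.gz sample_R2.fastq.gz"

# --chimSegmentMin above 0 switches on chimeric detection in STAR;
# --chimOutType WithinBAM puts chimeric alignments into the main BAM file
# instead of the default separate junction file.
star_cmd="STAR --genomeDir $genome_dir --readFilesIn $reads --readFilesCommand zcat \
--outSAMtype BAM Unsorted --outSAMunmapped Within \
--chimSegmentMin 10 --chimOutType WithinBAM"

# Print the command for inspection; run it with: eval "$star_cmd"
echo "$star_cmd"
```

With default output naming, the BAM file produced this way (Aligned.out.bam) would be the file to pass to Arriba via -x.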
OK, I'll try again. Thank you, sir.
STAR.out is a binary file.
Thank you very much, I solved this problem. Now I've got a bunch of fusion.tsv files, but I don't know what to do next. Can you give me some advice? For example, what software is available for analyzing fusions?
There is some information about interpretation of the output files in the manual, although this is tailored towards fusions in cancer. You may also find the description of the file format useful.
Interpretation depends on what you are looking for. I am not sure of how much help I can be here, because I have never worked with wheat and have only used Arriba in the context of cancer so far. What was your motivation to do this analysis? You probably had some ideas/hypotheses about expected findings.
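As a possible first step with the output files, one could narrow each table down to the predictions that Arriba rates with a given confidence. The sketch below is only an illustration: the helper name filter_by_confidence and the demo table are made up, and the demo uses a reduced set of columns. The confidence column is located by its header name rather than by position, in case the column order differs between versions.

```shell
# Keep the header plus rows whose "confidence" column matches the wanted level.
# The column is located by name rather than by position.
filter_by_confidence() {
    awk -F '\t' -v want="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "confidence") col = i; print; next }
        col && $col == want
    ' "$1"
}

# Made-up two-row table with a reduced set of columns, for illustration only.
printf '#gene1\tgene2\tconfidence\nGeneA\tGeneB\thigh\nGeneC\tGeneD\tlow\n' > demo_fusions.tsv
high=$(filter_by_confidence demo_fusions.tsv high)
echo "$high"
```

Since there is no blacklist for wheat, the confidence rating is likely to be less reliable than for human or mouse, so this kind of filter is a starting point for manual review rather than a final answer.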
OK, I solved this problem.
I'm looking for the wheat fusion genes. Could I use this software?