Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Using OrthoDB proteins and functional annotation #815

Open enriquepola1996 opened 2 months ago

enriquepola1996 commented 2 months ago

Hello dear BRAKER developers,

  1. I want to annotate a fungal genome for which I have no RNA-seq evidence, so I intend to use proteins as evidence. However, I am unsure how to use this option. I downloaded the Fungi.fa.gz file from https://bioinf.uni-greifswald.de/bioinf/partitioned_odb11/ and would like to know whether I can use it directly for annotation, for example like this:

braker.pl --genome=genome.fa --prot_seq=Fungi.fa

Is it correct to run the annotation this way or do I have to process that file first?

  2. I did a test run without evidence and I am getting two sets of output files, one from BRAKER and one from AUGUSTUS. Which one should I use?
[user@server braker]$ ls -lh
total 44M
drwx------ 2 user 4.0K Apr 29 17:37 Augustus
drwxrwxr-x 2 user 4.0K Apr 29 17:37 GeneMark-ES
-rw-rw-r-- 1 user 7.4M Apr 29 17:37 braker.aa
-rw-rw-r-- 1 user  22M Apr 29 17:37 braker.codingseq
-rw-rw-r-- 1 user  14M Apr 29 17:37 braker.gtf
-rw-rw-r-- 1 user 111K Apr 29 17:37 braker.log
drwxrwxr-x 2 user 4.0K Apr 29 17:37 errors
-rw-rw-r-- 1 user 5.7K Apr 29 17:10 genome_header.map
drwxrwxr-x 3 user 4.0K Apr 29 17:29 species
-rw-rw-r-- 1 user 1.4K Apr 29 17:37 what-to-cite.txt

[user@server braker]$ ls -lh Augustus/
total 31M
-rw-rw-r-- 1 user 5.3M Apr 29 17:37 augustus.ab_initio.aa
-rw-rw-r-- 1 user  16M Apr 29 17:37 augustus.ab_initio.codingseq
-rw-rw-r-- 1 user 9.6M Apr 29 17:37 augustus.ab_initio.gtf
  3. My other question: is it possible to use one of the protein files, for example braker.aa or augustus.ab_initio.aa, for functional annotation?

I would greatly appreciate the advice.

KatharinaHoff commented 1 month ago

You need to gunzip the file.
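
On the command line this is just `gunzip Fungi.fa.gz`. For anyone scripting the preparation step, here is a minimal Python sketch of the same thing, with a quick count of FASTA records as a sanity check. The function name `decompress_fasta` and the record count are illustrative assumptions and not part of BRAKER.

```python
import gzip
import shutil
from pathlib import Path


def decompress_fasta(gz_path, out_path):
    """Decompress a gzipped FASTA file and return the number of records in it.

    Hypothetical helper for illustration; BRAKER itself does not provide it.
    """
    gz_path, out_path = Path(gz_path), Path(out_path)
    # Stream the compressed file to disk so large databases are not held in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Count header lines (those starting with '>') as a quick sanity check.
    with open(out_path, "r") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Example call (paths taken from the question above):
# n_proteins = decompress_fasta("Fungi.fa.gz", "Fungi.fa")
```

The resulting Fungi.fa is what would then be passed to `--prot_seq`, as in the command from the original question.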

The braker.aa or augustus.hints.aa can be used as input for functional annotation pipelines.

If you call braker.pl with --busco_lineage and compleasm, BRAKER automatically determines which gene set should become braker.aa. If you do not use that argument, you should probably compute statistics on both gene sets yourself, and inspect them in a genome browser, before you choose a gene set.
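
As a starting point for comparing the two gene sets by hand, a small Python sketch along these lines could summarize each protein FASTA file. Which statistics to compute is an assumption here (protein count, mean and median length); the reply above does not specify them, and the helper names `read_fasta_lengths` and `summarize` are hypothetical.

```python
from statistics import mean, median


def read_fasta_lengths(path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                # Ignore a trailing '*' (stop symbol) if the file contains one.
                current += len(line.rstrip("*"))
    if current is not None:
        lengths.append(current)
    return lengths


def summarize(path):
    """Summarize one protein set: record count, mean and median length."""
    lengths = read_fasta_lengths(path)
    if not lengths:
        return {"proteins": 0, "mean_length": 0.0, "median_length": 0.0}
    return {
        "proteins": len(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
    }


# Example call (file names taken from the directory listing in the question):
# for path in ("braker.aa", "Augustus/augustus.ab_initio.aa"):
#     print(path, summarize(path))
```

Numbers like these only give a first impression. They do not replace a completeness assessment or a look at the gene models in a genome browser.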

enriquepola1996 commented 1 month ago

Thank you very much, it has worked very well for my work.