Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Adding training genes? #844

Open Artifice120 opened 4 months ago

Artifice120 commented 4 months ago

Afternoon,

I have completed several runs of BRAKER on a novel aphid genome with RNA-seq reads.

After GO enrichment, I found that only about a tenth of the reads matched GO terms for the nearest aphid species.

Is there a way to add the genes that have GO annotation support to the AUGUSTUS training files or hints? My initial run had fewer than 80 training genes.

Artifice120 commented 4 months ago

Found a potential solution. Can/should I edit the prothint_augustus.gff file from my last run and set the src= value to "M" on the contigs I know have trusted GO annotation, so that those hints are enforced and used to train AUGUSTUS?
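To make the proposed edit concrete, here is a minimal Python sketch of it, not a confirmed BRAKER workflow. It assumes the hints file is tab-separated GFF with nine columns and a `src=...` entry in the last column; the function name `mark_trusted_hints` and the contig names are hypothetical. Whether BRAKER would then use these hints for training is exactly the open question.

```python
import re

def mark_trusted_hints(lines, trusted_contigs):
    """Yield GFF hint lines, setting src=M on hints that lie on trusted contigs.

    Assumption: column 9 holds ';'-separated key=value pairs such as
    'src=P;mult=3;pri=4'. All other lines are passed through unchanged.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        fields = stripped.split("\t")
        # Leave comments, blank lines, and rows without 9 columns untouched.
        if stripped.startswith("#") or len(fields) != 9:
            yield stripped
            continue
        if fields[0] in trusted_contigs:
            if re.search(r"(^|;)src=[^;]*", fields[8]):
                # Replace the existing source label with M.
                fields[8] = re.sub(r"(^|;)src=[^;]*", r"\1src=M", fields[8])
            else:
                # No source label present: append one.
                attrs = fields[8].rstrip(";")
                fields[8] = f"{attrs};src=M" if attrs else "src=M"
        yield "\t".join(fields)

# In-memory example with made-up contig names and coordinates.
example = [
    "contig1\tProtHint\tintron\t100\t200\t0\t+\t.\tsrc=P;mult=3;pri=4\n",
    "contig2\tProtHint\tintron\t300\t400\t0\t-\t.\tsrc=P;mult=3;pri=4\n",
]
relabeled = list(mark_trusted_hints(example, {"contig1"}))
```

On a real run, the same function could be applied to the lines of prothint_augustus.gff and the result written to a new file, leaving the original untouched. This only relabels the hints; it does not by itself change which genes end up in the training set.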

Artifice120 commented 4 months ago

I could also add the supported sequences to the protein database I am using, but I am not sure that will have the desired effect.

UPDATE: Adding the supported genes had little effect. The new BRAKER annotation has the same number of GO assignments.