Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Truncated gene models #706

Closed rpetroll closed 11 months ago

rpetroll commented 1 year ago

Hi,

I used BRAKER3 to annotate an algal genome, providing RNA-seq data as well as protein sequences from closely related algal species. BRAKER ran without any problems, but I am unsure how to interpret the resulting gene models. When I inspect the genes from braker.gtf in a genome browser, most of the gene models look truncated: the RNA-seq blocks extend well beyond the annotated gene (see attached picture). This is the case for nearly all genes in the annotation. Since I am very new to gene annotation, I would be very grateful for any suggestions or comments. Are there specific parameters or options I could adjust?

Thank you!

[Attached image: data_braker]

KatharinaHoff commented 1 year ago

I recommend visualizing the intron hints in BRAKER's hintsfile. These show where introns are theoretically possible. Without that information, it is difficult to say whether g6329.t1 is indeed truncated. If there are no intron hints, zoom in and check whether there is a very rare splice site pattern that BRAKER cannot handle by default. Also check whether a longer ORF is possible: if there is none, BRAKER cannot predict a longer gene at that location.
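
One way to get the intron hints into a genome browser is to filter them out of the hints file into a separate track. The following is a minimal sketch, not part of BRAKER itself. It assumes the hints file is tab-separated GFF with the feature type in column 3 and AUGUSTUS-style attributes such as `mult=...` in column 9 (a missing `mult` is treated as 1). The function name `extract_intron_hints` and the `min_mult` parameter are made up for illustration.

```python
def extract_intron_hints(lines, min_mult=1):
    """Return the GFF lines that describe intron hints.

    Assumes tab-separated GFF with the feature type in column 3 and
    key=value attributes separated by ';' in column 9. 'min_mult' is a
    hypothetical filter on the 'mult' attribute (treated as 1 if absent).
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Keep only well-formed records whose feature is 'intron'.
        if len(fields) < 9 or fields[2].lower() != "intron":
            continue
        # Parse the attribute column into a dict, e.g. {'mult': '5', 'src': 'E'}.
        attrs = dict(
            item.strip().split("=", 1)
            for item in fields[8].split(";")
            if "=" in item
        )
        if int(attrs.get("mult", 1)) >= min_mult:
            kept.append(line)
    return kept
```

A possible use, assuming the hints file is named `hintsfile.gff` in the BRAKER output directory: read it with `open("hintsfile.gff")`, pass the file object to `extract_intron_hints`, write the returned lines to a new GFF file, and load that file as an annotation track next to braker.gtf. Raising `min_mult` hides introns supported by only a few reads, which can make the track easier to read.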