Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

BRAKER3 parameters for non-canonical splice sites #755

Open y-yoshioka1109 opened 4 months ago

y-yoshioka1109 commented 4 months ago

Hello,

I have to annotate a novel algal genome. This species contains non-canonical introns (GC-AG and GA-AG in addition to GT-AG).

I used the v3.0.7 (isoseq) Docker container with proteome and Iso-Seq data. When I ran BRAKER, I added the option "--augustus_args="--allow_hinted_splicesites=gcag,gaag"". The pipeline ran perfectly and introns were predicted. However, I could not find any non-canonical GA-AG introns. I also changed L7652 and L7942 of braker.pl to include "gaag" (like "--allow_hinted_splicesites=gcag,gaag ";), but that did not work either. RNA-seq alignment results show that GA-AG splice sites are indeed present. Is there a solution?

I would appreciate it if you could help me.

Best regards,

y-yoshioka1109 commented 4 months ago

I understand that I need to modify braker.pl in the container as in the "non-canonical-intron" branch. Thanks.

y-yoshioka1109 commented 4 months ago

I am still struggling with this. Even though I modified some lines of braker.pl in the isoseq Docker container (v3.0.7) to match the "non-canonical-intron" branch, the final output (braker.gtf) shows a very low percentage of GAAG introns: 0.3% GAAG, 35% GCAG, and 65% GTAG. This is unexpected for my target species. The Iso-Seq alignment results showed that 15% of introns were GAAG, 67% were GCAG, and 18% were GTAG.

Does BRAKER's prediction depend on the Iso-Seq coverage?

I would really appreciate it if you could help me.
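
For anyone who wants to reproduce splice-site percentages like the ones quoted above for braker.gtf, here is a minimal Python sketch that tallies donor/acceptor dinucleotide pairs (GTAG, GCAG, GAAG, ...) over the unique introns implied by a GTF. It is not part of BRAKER, and the function names (`read_fasta`, `splice_site_counts`) are hypothetical. It assumes the GTF has `exon` features carrying a `transcript_id "..."` attribute and that the genome fits in memory; if a file only contains `CDS` features, the feature type in the filter would need to be changed.

```python
from collections import Counter, defaultdict
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: uppercase sequence}."""
    genome, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks).upper()
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks).upper()
    return genome


def splice_site_counts(gtf_lines, genome):
    """Count donor+acceptor pairs (e.g. 'GAAG') over unique introns."""
    # Assumption: exon lines carry a transcript_id "..." attribute.
    exons = defaultdict(list)  # (seqname, strand, transcript) -> [(start, end)]
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            key = (fields[0], fields[6], match.group(1))
            exons[key].append((int(fields[3]), int(fields[4])))

    # An intron is the gap between consecutive exons of one transcript.
    # Introns shared by several transcripts are counted once.
    introns = set()
    for (seqname, strand, _), spans in exons.items():
        spans.sort()
        for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
            if right_start - left_end > 1:
                introns.add((seqname, strand, left_end + 1, right_start - 1))

    counts = Counter()
    for seqname, strand, start, end in introns:
        seq = genome[seqname][start - 1:end]  # GTF is 1-based, inclusive
        if strand == "-":
            seq = revcomp(seq)
        if len(seq) >= 4:
            counts[seq[:2] + seq[-2:]] += 1
    return counts
```

With placeholder file names, this could be called as `splice_site_counts(open("braker.gtf"), read_fasta(open("genome.fa")))`; dividing each count by the total gives percentages. Note that it counts each distinct intron once, whereas percentages derived from read alignments may be weighted by coverage, so the two figures are not necessarily directly comparable.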