Closed by amvarani 3 years ago
Hi,
I'm sorry that TSEBRA didn't work properly for your annotation.
The problem could be that the default configuration filters too many transcripts out.
I added a more inclusive configuration to the repository at TSEBRA/config/pref_braker1.cfg, which you can use instead of the default.cfg.
I hope this improves your results.
Best, Lars
Hi Lars, Thanks a lot for your reply. However, after changing the configuration file to "pref_braker1.cfg", the BUSCO results are still not good:
C:79.6%[S:76.1%,D:3.5%],F:9.4%,M:11.0%,n:1614
It seems to me that there are quite a few transcripts in your BRAKER results that are not supported by RNA-seq or protein evidence. TSEBRA removes all of these transcripts.
I added another configuration file (keep_ab_initio.cfg) to the repository that keeps these transcripts.
Hi there! Well, the results are still not good: C:79.0%[S:75.1%,D:3.9%],F:10.8%,M:10.2%,n:1614 Could I send you my files so you can take a look, if possible?
Hi, yes, please send me the files so I can take a look at the issue. My email is lars.gabriel@uni-greifswald.de Best, Lars
Hi there, Finally, with the kind help of @LarsGab, I have found the problem! I was using the EvidenceModeler scripts "augustus_GTF_to_EVM_GFF3.pl" and "gff3_file_to_proteins.pl" to convert the TSEBRA GTF file to GFF3 and then to protein FASTA format, respectively. I noticed that the conversion done by these scripts does not work properly when BRAKER is run with the option "--alternatives-from-evidence=true". As a solution, the best strategy is to use the Augustus scripts "gtf2gff.pl" and "gtf2aa.pl", respectively. Using these scripts, I finally got reasonable BUSCO scores:
C:98.4%[S:93.6%,D:4.8%],F:0.6%,M:1.0%,n:1614
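For anyone who wants to check whether their own GTF is likely to be affected, here is a minimal Python sketch. The thread does not state why the EvidenceModeler scripts mishandled the file; the sketch only assumes that the relevant difference is genes carrying more than one transcript, which is what "--alternatives-from-evidence=true" can produce. The function name `transcripts_per_gene` and the toy records are hypothetical, and the parser assumes GTF feature lines with `gene_id "..."` and `transcript_id "..."` attributes in column 9.

```python
import re
from collections import defaultdict

GENE_RE = re.compile(r'gene_id "([^"]+)"')
TX_RE = re.compile(r'transcript_id "([^"]+)"')


def transcripts_per_gene(gtf_lines):
    """Map each gene_id to the set of transcript_ids seen in GTF lines.

    Assumption: attributes follow the GTF convention
    transcript_id "x"; gene_id "y"; lines without both are skipped.
    """
    genes = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        gene = GENE_RE.search(fields[8])
        tx = TX_RE.search(fields[8])
        if gene and tx:
            genes[gene.group(1)].add(tx.group(1))
    return genes


# Toy records (hypothetical), standing in for lines read from a GTF file.
example = [
    'chr1\tAUGUSTUS\tCDS\t100\t200\t.\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";',
    'chr1\tAUGUSTUS\tCDS\t100\t260\t.\t+\t0\ttranscript_id "g1.t2"; gene_id "g1";',
    'chr1\tAUGUSTUS\tCDS\t900\t990\t.\t-\t0\ttranscript_id "g2.t1"; gene_id "g2";',
]

# Keep only genes with more than one transcript.
multi = {g: sorted(t) for g, t in transcripts_per_gene(example).items() if len(t) > 1}
print(multi)
```

On a real file, pass an open file handle instead of the toy list, for example `transcripts_per_gene(open("tsebra.gtf"))`, where the file name is a placeholder. If many genes show up in `multi`, it is worth comparing the protein counts produced by the two conversion routes before trusting a BUSCO score.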
@amvarani Thank you for sharing! This is important to know, because I also use the two Perl scripts that you used before to convert the files for measuring BUSCO. I will try the approach you suggested.
Hi there, I would like to describe my experience using TSEBRA with a plant genome, using BUSCO as a benchmark. I have a repeat-masked genome and BRAKER1 and BRAKER2 annotation results. My results are:
BRAKER1: C:97.7%[S:87.2%,D:10.5%],F:1.4%,M:0.9%,n:1614
BRAKER2: C:97.7%[S:72.6%,D:25.1%],F:0.9%,M:1.4%,n:1614
TSEBRA: C:78.4%[S:75.1%,D:3.3%],F:10.8%,M:10.8%,n:1614
The same annotated genome deposited at Phytozome: C:99.7%[S:66.9%,D:32.8%],F:0.1%,M:0.2%,n:1614
Why is TSEBRA messing up the BRAKER1 and BRAKER2 annotations? Maybe I need to tune the configuration file? Any help?
Thanks a lot
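To make BUSCO summaries like the ones in this post easier to compare side by side, here is a minimal Python sketch that parses the one-line notation into numbers. In that notation, C is complete, S is single-copy, D is duplicated, F is fragmented, M is missing, and n is the number of BUSCO groups searched. The function name `parse_busco` and the regular expression are my own, not part of BUSCO or TSEBRA; the strings are copied from the results above.

```python
import re

# Pattern for the BUSCO one-line summary, e.g.
# C:97.7%[S:87.2%,D:10.5%],F:1.4%,M:0.9%,n:1614
BUSCO_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco(summary):
    """Return the percentages (floats) and n (int) from a BUSCO summary line."""
    match = BUSCO_RE.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {summary!r}")
    scores = {key: float(value) for key, value in match.groupdict().items() if key != "n"}
    scores["n"] = int(match.group("n"))
    return scores


results = {
    "BRAKER1": "C:97.7%[S:87.2%,D:10.5%],F:1.4%,M:0.9%,n:1614",
    "BRAKER2": "C:97.7%[S:72.6%,D:25.1%],F:0.9%,M:1.4%,n:1614",
    "TSEBRA": "C:78.4%[S:75.1%,D:3.3%],F:10.8%,M:10.8%,n:1614",
    "Phytozome": "C:99.7%[S:66.9%,D:32.8%],F:0.1%,M:0.2%,n:1614",
}

parsed = {name: parse_busco(line) for name, line in results.items()}

for name, s in parsed.items():
    print(f"{name:10s} complete={s['C']:5.1f}%  fragmented={s['F']:5.1f}%  missing={s['M']:5.1f}%")
```

Parsed this way, it is easy to see that the TSEBRA run differs from the others mainly in its much higher fragmented and missing fractions, not only in its lower duplication.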