Gaius-Augustus / TSEBRA

TSEBRA: Transcript Selector for BRAKER

Number of genes predicted with TSEBRA is more while BUSCO score remains the same #25

Open ShakunthalaNatarajan opened 1 year ago

ShakunthalaNatarajan commented 1 year ago

Hello! I ran BRAKER for gene prediction with proteins of close homology and RNA-seq data together (Trial 1). I also ran BRAKER separately with the proteins of close homology and with the RNA-seq data, and combined the two predictions with TSEBRA. The BUSCO scores of Trial 1 and the TSEBRA run are:
Trial 1: C:98.6%[S:96.7%,D:1.9%],F:0.5%,M:0.9%,n:6641
TSEBRA: C:98.6%[S:92.1%,D:6.5%],F:0.3%,M:1.1%,n:6641
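To make the two summary lines easier to compare field by field, here is a minimal Python sketch (standard library only) that parses the BUSCO short-summary notation quoted above. The function name `parse_busco` and the regular expression are illustrative assumptions, not part of BUSCO or TSEBRA. The counts are approximations, since they are recomputed from percentages that BUSCO has already rounded.

```python
import re

# Assumed pattern for the one-line BUSCO notation quoted in this post;
# optional whitespace after each colon is tolerated.
BUSCO_RE = re.compile(
    r"C:\s*([\d.]+)%\[S:\s*([\d.]+)%,D:\s*([\d.]+)%\],"
    r"F:\s*([\d.]+)%,M:\s*([\d.]+)%,n:\s*(\d+)"
)

def parse_busco(summary):
    """Parse a BUSCO summary string into percentages plus the total n."""
    match = BUSCO_RE.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {summary!r}")
    c, s, d, f, m = (float(x) for x in match.groups()[:5])
    return {"C": c, "S": s, "D": d, "F": f, "M": m, "n": int(match.group(6))}

def approx_count(scores, key):
    """Approximate number of BUSCOs in a category (percentages are rounded)."""
    return round(scores[key] / 100 * scores["n"])

trial1 = parse_busco("C: 98.6%[S:96.7%,D:1.9%],F:0.5%,M:0.9%,n:6641")
tsebra = parse_busco("C:98.6%[S:92.1%,D:6.5%],F:0.3%,M:1.1%,n:6641")

# Print the difference (TSEBRA minus Trial 1) in percentage points per category.
for key in ("C", "S", "D", "F", "M"):
    print(key, round(tsebra[key] - trial1[key], 1))
```

On these numbers, the complete fraction (C) is identical in both runs, while the split between single-copy (S) and duplicated (D) BUSCOs differs.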

But the number of genes in each case is:
Trial 1: 12931
TSEBRA: 13353

I calculated the average length of the peptide sequences in both gene predictions:
Trial 1: 487 amino acids
TSEBRA: 478 amino acids
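For reference, an average like this can be computed from the predicted protein FASTA file with a few lines of Python. The sketch below uses only the standard library. The helper names `read_fasta_lengths` and `summarize` are hypothetical, and the two short demo sequences are made up for illustration, not taken from either prediction. It reports the median as well, since a mean alone can hide a shift in the length distribution.

```python
from statistics import mean, median

def read_fasta_lengths(lines):
    """Return {sequence_id: length} from an iterable of FASTA lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            # Do not count a trailing stop-codon symbol as an amino acid.
            lengths[current] += len(line.rstrip("*"))
    return lengths

def summarize(lengths):
    """Basic length statistics for a set of peptide sequences."""
    values = sorted(lengths.values())
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": values[0],
        "max": values[-1],
    }

# Made-up demo input; replace with open("proteins.fa") for a real file
# (the file name is a placeholder).
demo = """>g1.t1
MKT
AAL*
>g2.t1
MSSQ
""".splitlines()

demo_lengths = read_fasta_lengths(demo)
demo_summary = summarize(demo_lengths)
print(demo_summary)
```

If a prediction contains several transcripts per gene, the result depends on whether all isoforms or only one per gene are included, so both files should be processed the same way.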

So could it be possible that the gene prediction by TSEBRA is more fragmented?

Is there any other way to decide which of the two predictions is better?

It would be great if you could help. Thank you!