Gaius-Augustus / GALBA

GALBA is a pipeline for fully automated prediction of protein-coding gene structures with AUGUSTUS in novel eukaryotic genomes, for the scenario where high-quality proteins from one or several closely related species are available.

aa2nonred.pl still introduces randomness in training success #19

Open KatharinaHoff opened 1 year ago

KatharinaHoff commented 1 year ago

Replace aa2nonred.pl (which randomly selects one of several highly similar training genes) with a method that identifies redundancy and then consistently selects the best isoform. In GALBA this now only affects paralogues, not alternative transcripts, but it still leads to random accuracy drops and unstable training success.
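
A minimal sketch of what a deterministic selection could look like, not the GALBA implementation. It assumes the pairs of highly similar training genes have already been computed elsewhere (for example by an all-vs-all protein search), and it assumes "best" means the longest protein, with ties broken by gene ID. The issue does not define "best", so both the criterion and the name `select_representatives` are hypothetical.

```python
from collections import defaultdict


def select_representatives(genes, similar_pairs):
    """Pick one gene per redundancy cluster, deterministically.

    genes:         dict mapping gene ID -> protein sequence
    similar_pairs: iterable of (id_a, id_b) tuples judged highly similar
                   (assumed to be computed by an external tool)
    """
    # Union-find over gene IDs to group similar genes into clusters.
    parent = {gene_id: gene_id for gene_id in genes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always keep the smaller ID as root.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for gene_id in genes:
        clusters[find(gene_id)].append(gene_id)

    # Assumed criterion: longest protein wins; ties go to the smallest ID.
    # No random choice is involved, so repeated runs give the same set.
    representatives = [
        min(members, key=lambda g: (-len(genes[g]), g))
        for members in clusters.values()
    ]
    return sorted(representatives)


example_genes = {"g1": "MKT", "g2": "MKTAA", "g3": "MKTAA", "g4": "MQQ"}
example_pairs = [("g1", "g2"), ("g3", "g2")]
print(select_representatives(example_genes, example_pairs))
```

Because the representative depends only on the cluster contents and a fixed tie-break, the order in which the similar pairs are listed does not change the selected training set.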