Gaius-Augustus / GALBA

GALBA is a pipeline for fully automated prediction of protein-coding gene structures with AUGUSTUS in novel eukaryotic genomes, for the scenario where high-quality proteins from one or several closely related species are available.

Iterative training #15

Closed KatharinaHoff closed 1 year ago

KatharinaHoff commented 1 year ago
  1. integration of miniprothint (not yet optimal: miniprot currently runs twice, to be fixed later)
  2. iterative training of AUGUSTUS

This increases runtime substantially for two reasons: (1) miniprot runs twice, and (2) there is an additional AUGUSTUS run with hints.

TODOs: integrate miniprothint better (no double calling of miniprot); extract training genes from miniprothint; possibly select the best training genes in a better way; possibly move the train.gb processing steps in the pipeline.
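
One possible way to approach the "no double calling of miniprot" TODO is to cache the alignment output and reuse it in later steps. The following is only a minimal sketch of that idea, not GALBA's actual code: `cached_step`, `fake_miniprot`, and the file name `miniprot.gff` are hypothetical names chosen for illustration, and the stand-in function writes a dummy file instead of running miniprot.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_step(output: Path, produce: Callable[[Path], None]) -> bool:
    """Run produce() only if `output` is missing or empty.

    Returns True if the step was executed, False if the existing file was reused.
    """
    if output.exists() and output.stat().st_size > 0:
        return False
    # Write to a temporary name first so an interrupted run
    # never leaves a partial file that looks like a valid cache.
    tmp = output.with_suffix(output.suffix + ".tmp")
    produce(tmp)
    tmp.replace(output)
    return True


calls = []


def fake_miniprot(out: Path) -> None:
    # Hypothetical stand-in for the expensive alignment run.
    calls.append(out.name)
    out.write_text("##gff-version 3\n")


workdir = Path(tempfile.mkdtemp())
alignment = workdir / "miniprot.gff"

# First consumer (e.g. hint generation) triggers the run.
ran_first = cached_step(alignment, fake_miniprot)
# Second consumer (e.g. training gene extraction) reuses the file.
ran_second = cached_step(alignment, fake_miniprot)

print(ran_first, ran_second, len(calls))
```

In this sketch the second call returns without invoking the producer, so the expensive step runs once regardless of how many pipeline stages request its output. Whether this fits GALBA depends on whether both stages can use the same miniprot output, which is not stated here.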