gatech-genemark / ProtHint

Protein hint generation pipeline for gene finding in eukaryotic genomes

Genome is softmasked? #29

Closed pjx1990 closed 3 years ago

pjx1990 commented 3 years ago

Does ProtHint expect the input genome sequence to be softmasked? Thanks

tomasbruna commented 3 years ago

Softmasking is used by GeneMark-ES to predict seed genes (as illustrated here). Softmasking generally improves the accuracy of GeneMark-ES and this improvement can result in better final predictions.

The rest of ProtHint ignores softmasking. If you generate the gene seeds separately (and pass them to ProtHint with the --geneSeeds option), it does not matter whether the genome is softmasked.
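
If you are not sure whether your genome is softmasked, a quick check is to count lowercase bases, since softmasking conventionally marks repeats in lowercase. The snippet below is a minimal sketch, not part of ProtHint or GeneMark-ES; the function name softmasked_fraction and the toy sequence are made up for illustration.

```python
from typing import Iterable


def softmasked_fraction(fasta_lines: Iterable[str]) -> float:
    """Return the fraction of sequence characters that are lowercase.

    Hypothetical helper (not a ProtHint API): header lines starting
    with '>' and blank lines are skipped; everything else is sequence.
    """
    masked = 0
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        total += len(line)
        # Lowercase letters are the conventional softmasking marker.
        masked += sum(1 for base in line if base.islower())
    return masked / total if total else 0.0


# Toy input: 8 of the 20 bases are lowercase.
example = [">chr1", "ACGTacgtNN", "acgtACGTAC"]
print(softmasked_fraction(example))  # prints 0.4
```

For a real genome you could pass an open file handle instead of the list, e.g. softmasked_fraction(open("genome.fasta")). A result of 0.0 suggests the genome is not softmasked, which only matters for the GeneMark-ES seed step as described above.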