ogotoh / spaln

Genome mapping and spliced alignment of cDNA or amino acid sequences
GNU General Public License v2.0

Question: Is SPALN aware of soft-masked repeats? #31

Closed marchoeppner closed 3 years ago

marchoeppner commented 3 years ago

Hi,

I am building a genome annotation pipeline and was wondering if Spaln is at all aware of soft-masked regions in a genome sequence? This information is sometimes used (see Exonerate) to prevent the seeding of alignments in repeats, if I am not mistaken.

Thanks for the clarification! /M
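For readers unfamiliar with the term: soft-masking conventionally marks repeats by writing them in lowercase in the FASTA sequence, leaving the bases themselves intact. A minimal sketch of how a pipeline might locate those regions, or discard the masking, is shown here. The function name `soft_masked_intervals` is hypothetical and belongs to neither Spaln nor Exonerate.

```python
import re

def soft_masked_intervals(seq):
    """Return half-open (start, end) intervals of lowercase (soft-masked) bases."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]

seq = "ACGTacgtacgtGGCCnnTT"

# Positions of the soft-masked runs in the toy sequence.
print(soft_masked_intervals(seq))

# Upper-casing removes the soft-masking without changing any base.
print(seq.upper())
```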

ogotoh commented 3 years ago

No, Spaln isn't aware of soft-masked regions; they are treated as normal sequence. K-mers that occur excessively often are down-scored when Spaln searches for candidate genic regions. Accordingly, repeated genomic regions are likely (though not certain) to be eliminated from the candidate regions. Repeat masking is a double-edged sword. In my experience, Spaln generally works better without repeat masking.

Osamu
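The idea of down-scoring overly abundant k-mers can be illustrated with a toy sketch. This is not Spaln's actual implementation: the function names, the `max_count` threshold, and the weighting formula are all assumptions chosen only to show why repeats tend to drop out of the candidate regions without any masking.

```python
from collections import Counter

def kmer_weights(genome, k, max_count):
    """Assign each k-mer a weight that shrinks once its count exceeds max_count.

    Hypothetical scheme: weight = min(1, max_count / count).
    Case is ignored, so soft-masked (lowercase) bases are treated as normal.
    """
    counts = Counter(genome[i:i + k].upper() for i in range(len(genome) - k + 1))
    return {kmer: min(1.0, max_count / c) for kmer, c in counts.items()}

def window_score(window, weights, k):
    """Sum the k-mer weights in a window; k-mers absent from the table score 0."""
    return sum(weights.get(window[i:i + k].upper(), 0.0)
               for i in range(len(window) - k + 1))

genome = "AAAAAC"
weights = kmer_weights(genome, k=2, max_count=2)

# "AA" occurs 4 times and is down-weighted; "AC" occurs once and keeps full weight.
print(weights)
print(window_score("AAAC", weights, k=2))
```

Under a scheme like this, a window made mostly of repeated k-mers accumulates a low score and is less likely to be picked as a candidate region, while a rare k-mer inside a repeat can still contribute.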