torognes / vsearch

Versatile open-source tool for microbiome analysis

Option to not ignore terminal gaps when computing consensus #86

Open torognes opened 9 years ago

torognes commented 9 years ago

A user has asked for this option:

When using "-cluster_fast", VSEARCH counts terminal gaps in the alignment when it builds consensus sequences, so a lot of relevant sequence information in the flanking regions is lost:

Example:

-----CCCAGT
----ACCCAGT
----ACCCAGT
ACTGAACCAGT
____________
    ACCCAGT consensus

Could you add an option that builds the majority consensus but ignores positions which have no coverage? So that the consensus sequence is:

ACTGACCCAGT

torognes commented 9 years ago

Related to #25.

frederic-mahe commented 5 years ago

Right now vsearch behaves like usearch and truncates the consensus when there is a majority of gaps, including terminal gaps. That's the default behavior. Should we introduce a --cons_notruncate option to ignore terminal gaps when creating the consensus sequence? (see example above)
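
To make the difference concrete, here is a minimal Python sketch of the two behaviours applied to the example alignment from the first comment. It is only an illustration, not vsearch's actual implementation: the function name `majority_consensus` and the `ignore_terminal_gaps` parameter are hypothetical, and ties are resolved naively (the first symbol seen wins), which the real code may handle differently.

```python
from collections import Counter


def majority_consensus(rows, ignore_terminal_gaps=False):
    """Column-wise majority consensus of equal-length aligned rows.

    Hypothetical helper for illustration only. A column whose majority
    symbol is a gap ('-') is dropped from the consensus.
    """
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all rows must have the same aligned length")

    # For each row, the half-open span [start, end) that excludes
    # leading and trailing (terminal) gaps.
    spans = []
    for r in rows:
        start = len(r) - len(r.lstrip("-"))
        end = len(r.rstrip("-"))
        spans.append((start, end))

    consensus = []
    for col in range(width):
        counts = Counter()
        for r, (start, end) in zip(rows, spans):
            # Optionally skip rows that only contribute a terminal gap here.
            if ignore_terminal_gaps and not (start <= col < end):
                continue
            counts[r[col]] += 1
        if not counts:
            continue  # no row covers this column at all
        symbol, _ = counts.most_common(1)[0]
        if symbol != "-":
            consensus.append(symbol)
    return "".join(consensus)


alignment = [
    "-----CCCAGT",
    "----ACCCAGT",
    "----ACCCAGT",
    "ACTGAACCAGT",
]

print(majority_consensus(alignment))                             # ACCCAGT
print(majority_consensus(alignment, ignore_terminal_gaps=True))  # ACTGACCCAGT
```

With terminal gaps counted, the first four columns have a gap majority (3 of 4 rows) and are dropped. With terminal gaps ignored, only the longest sequence covers those columns, so its bases are kept.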