matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

add pre-masking to nbc classifier #248

Closed: matsen closed this issue 12 years ago

matsen commented 12 years ago

Following Werner, 2011, we should trim the training set to the region being sequenced.

For us, that simply means computing the mask from the aligned sequences, as is done for pplacer, then applying this mask to the (aligned) reference sequences before de-aligning them and adding them to the table of word counts.

matsen commented 12 years ago

This should be optional, with a --pre-mask flag.
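
The pre-masking step proposed in this issue can be sketched roughly as follows. This is a minimal Python illustration, not pplacer's OCaml code: all function names are hypothetical, the mask is assumed to keep every column in which at least one aligned query sequence has a non-gap character, and the word counts go into a single table rather than one table per taxon. The `pre_mask` argument stands in for the proposed `--pre-mask` flag.

```python
from collections import Counter

GAP_CHARS = frozenset("-.")  # assumption: '-' and '.' both denote gaps


def compute_mask(aligned_seqs):
    """Return one boolean per column: True if any sequence has a non-gap there."""
    if not aligned_seqs:
        return []
    width = len(aligned_seqs[0])
    if any(len(s) != width for s in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")
    return [any(s[i] not in GAP_CHARS for s in aligned_seqs) for i in range(width)]


def apply_mask(aligned_seq, mask):
    """Keep only the columns of an aligned sequence where the mask is True."""
    if len(aligned_seq) != len(mask):
        raise ValueError("sequence and mask must have the same length")
    return "".join(c for c, keep in zip(aligned_seq, mask) if keep)


def dealign(seq):
    """Remove gap characters from a sequence."""
    return "".join(c for c in seq if c not in GAP_CHARS)


def count_words(seq, k):
    """Count all overlapping words of length k in an unaligned sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def train_word_counts(aligned_refs, aligned_queries, k=8, pre_mask=False):
    """Build a word-count table from references, optionally pre-masked."""
    mask = compute_mask(aligned_queries) if pre_mask else None
    counts = Counter()
    for ref in aligned_refs:
        if mask is not None:
            ref = apply_mask(ref, mask)  # trim to the sequenced region
        counts.update(count_words(dealign(ref), k))
    return counts
```

With `pre_mask=False` the references are de-aligned and counted in full, so the option leaves the existing behaviour unchanged; with `pre_mask=True` only the columns covered by the queries contribute words.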