matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

`guppy rarefy` #249

Closed matsen closed 12 years ago

matsen commented 12 years ago

Rarefaction for placements.

Takes an integer and one or more place files, and writes out a single place file. The integer could be a flag or a positional argument; I guess a flag makes the most sense to me.

Call the integer n_taken.

Fill a vector with the normalized multiplicities of the pqueries. That is, if we have pqueries 1 and 2, with multiplicities 4 and 6 respectively, then the vector is [0.4; 0.6].
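
As a rough illustration of the normalization step (not pplacer code), here is a small Python sketch. The function name `normalized_multiplicities` and the plain list-of-numbers input are assumptions made for the example.

```python
def normalized_multiplicities(multiplicities):
    """Scale multiplicities so that they sum to 1.

    `multiplicities` is a hypothetical stand-in: one non-negative
    number per pquery, in pquery order.
    """
    total = sum(multiplicities)
    if total <= 0:
        raise ValueError("total multiplicity must be positive")
    return [m / total for m in multiplicities]


# The example from the issue: multiplicities 4 and 6.
print(normalized_multiplicities([4, 6]))  # prints [0.4, 0.6]
```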

Pass this vector, along with the integer n_taken, to the Gsl_randist.multinomial function to get back a vector of integers of the same length. These are the new multiplicities. Any pquery whose new multiplicity is zero should be filtered out.
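
The whole procedure can be sketched in Python as follows. This is only an illustration under stated assumptions. Pqueries are modeled as hypothetical `(name, multiplicity)` pairs, the function name `rarefy` and its `seed` parameter are invented for the example, and the multinomial draw is done with the standard library's `random.choices` rather than with `Gsl_randist.multinomial`.

```python
import random
from collections import Counter


def rarefy(pqueries, n_taken, seed=None):
    """Resample pquery multiplicities with one multinomial draw.

    `pqueries` is a hypothetical list of (name, multiplicity) pairs.
    Returns a list of (name, new_multiplicity) pairs; pqueries whose
    new multiplicity is zero are left out.
    """
    if n_taken < 0:
        raise ValueError("n_taken must be non-negative")
    total = sum(m for _, m in pqueries)
    if total <= 0:
        raise ValueError("total multiplicity must be positive")

    # Normalized multiplicities, e.g. [0.4, 0.6] for multiplicities 4 and 6.
    probs = [m / total for _, m in pqueries]

    # Drawing n_taken indices independently with these probabilities and
    # counting them gives a multinomial sample. This stands in for the
    # GSL call named in the issue.
    rng = random.Random(seed)
    draws = rng.choices(range(len(pqueries)), weights=probs, k=n_taken)
    counts = Counter(draws)

    # Keep only the pqueries that were drawn at least once.
    return [
        (name, counts[i])
        for i, (name, _) in enumerate(pqueries)
        if counts[i] > 0
    ]


result = rarefy([("pq1", 4), ("pq2", 6)], n_taken=5, seed=0)
print(result)  # the new multiplicities sum to 5
```

Since a multinomial draw samples with replacement, a pquery's new multiplicity can be larger than its original one when n_taken is large relative to the total.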