biocore / biom-format

The Biological Observation Matrix (BIOM) Format Project
http://biom-format.org

Randomly subsample samples from OTU table #273

Closed josenavas closed 7 years ago

josenavas commented 10 years ago

Porting this issue from QIIME (see qiime/qiime#1385).

From @lkursell:

It would be helpful to be able to randomly subsample samples from larger groups. For instance, if you were comparing your samples against GG or AG, etc., you'd want to pull out X samples at random for the comparison, and then repeat.

wasade commented 10 years ago

This would be fantastic, but I think it is out of scope for 2.0. @ElBrogrammer or @gregcaporaso, since this functionality is already in skbio, is it worth keeping this issue open?

jairideout commented 10 years ago

The functionality in skbio is for subsampling vectors of counts (e.g., input to alpha diversity measures). Not sure if that'll help here, but agree this is outside of the scope for 2.0.
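To make the distinction concrete, here is a rough sketch of what subsampling a vector of counts means. It uses only the Python standard library and is not skbio's actual implementation; the helper name `subsample_count_vector` and the example counts are made up for illustration. Individual observations are drawn without replacement from one sample's counts, so the sample itself is kept but its depth shrinks.

```python
import random
from collections import Counter


def subsample_count_vector(counts, n, rng=None):
    """Draw n individual observations without replacement from a count vector.

    Hypothetical helper for illustration only; not skbio's implementation.
    """
    rng = rng or random.Random()
    if n > sum(counts):
        raise ValueError("cannot draw more observations than the vector holds")
    # Expand the counts into one entry per observation, labelled by its index.
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    drawn = Counter(rng.sample(pool, n))
    # Rebuild a vector of the same length; positions never drawn become 0.
    return [drawn.get(i, 0) for i in range(len(counts))]


otu_counts = [10, 0, 5, 85]  # made-up counts for a single sample
rarefied = subsample_count_vector(otu_counts, 20, rng=random.Random(0))
```

The request in this issue is different: it asks to pick whole samples at random and leave their counts untouched.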

gregcaporaso commented 10 years ago

Yeah, outside the scope of this release, but it would be ideal to do this using the same code, and I think that should be possible.


wasade commented 7 years ago

I believe this is actually already solved with [Table.subsample(..., by_id=True)](https://github.com/biocore/biom-format/blob/2.1.5/biom/table.py#L2498). Please reopen if I'm wrong here.
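For anyone landing here later, a minimal sketch of how that call could cover the original request (draw X samples at random, then repeat). It assumes `table` is a `biom.Table` and relies only on the `subsample(n, axis='sample', by_id=True)` method linked above; the helper name `repeated_sample_subsets` is hypothetical.

```python
def repeated_sample_subsets(table, n, repeats):
    """Return `repeats` independent random subsets of n samples each.

    Assumes `table` is a biom.Table (or anything with a compatible
    `subsample` method). With by_id=True, whole sample IDs are drawn
    rather than individual counts.
    """
    return [table.subsample(n, axis='sample', by_id=True)
            for _ in range(repeats)]
```

Something like `repeated_sample_subsets(biom.load_table('otu_table.biom'), 10, 5)` would then give five independently drawn tables; the file name is a placeholder. As far as I can tell from the linked source, `subsample` also drops samples and observations that end up empty, so it is worth checking the shape of each result.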