caporaso-lab / sourcetracker2

SourceTracker2
BSD 3-Clause "New" or "Revised" License

More collapse methods for source samples #17

Open gregcaporaso opened 8 years ago

gregcaporaso commented 8 years ago

From @lkursell on March 24, 2016 17:47

Currently, ST2 sums the OTU counts of all samples that belong to the same source. This can cause overestimation of contributions from source environments that have more samples.

One solution would be to accept --collapse_sources and --collapse_sinks flags together with a --collapse_method option such as mean or sum. This can currently be done with the collapse_samples.py script, but that requires changes to mapping files and sample names.
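To make the difference between the two proposed methods concrete, here is a minimal sketch using pandas. It is not ST2 code: the table, the `env` mapping, and the `collapse_sources` helper are all hypothetical names invented for illustration, and the proposed flags do not exist yet.

```python
import pandas as pd

# Hypothetical OTU table: rows are source samples, columns are OTU counts.
table = pd.DataFrame(
    {"otu_1": [10, 20, 30, 5], "otu_2": [0, 10, 20, 15]},
    index=["s1", "s2", "s3", "s4"],
)

# Hypothetical mapping from sample ID to source environment.
# "gut" has three samples, "soil" has only one.
env = pd.Series(["gut", "gut", "gut", "soil"], index=table.index, name="env")


def collapse_sources(table, env, method="sum"):
    """Collapse samples into one row per source (illustrative helper only)."""
    grouped = table.groupby(env)
    if method == "sum":
        return grouped.sum()
    if method == "mean":
        return grouped.mean()
    raise ValueError(f"unknown collapse method: {method!r}")


summed = collapse_sources(table, env, "sum")
averaged = collapse_sources(table, env, "mean")

# Total sequence count per collapsed source under each method.
print(summed.sum(axis=1))
print(averaged.sum(axis=1))
```

With `sum`, the collapsed "gut" row carries the counts of all three samples, so its total grows with the number of samples. With `mean`, the total stays on the scale of a single sample.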

_Copied from original issue: biota/sourcetracker2_internal#28_

wdwvt1 commented 8 years ago

+1 for more collapse methods.

I don't think the current version of the code overestimates some sources, because the default is to rarefy the collapsed source matrix. Thus, even with N sources, the sequence/probability mass contribution will be the same as with 1 source.
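A rough sketch of that argument, assuming rarefaction means subsampling each collapsed source without replacement to a common depth. The `rarefy` helper and the counts below are hypothetical and are not taken from the ST2 code base.

```python
import numpy as np


def rarefy(counts, depth, rng):
    """Subsample a count vector without replacement to `depth` sequences."""
    counts = np.asarray(counts, dtype=int)
    if counts.sum() < depth:
        raise ValueError("cannot rarefy to a depth above the total count")
    # Expand counts into one entry per sequence, labelled by OTU index.
    pool = np.repeat(np.arange(counts.size), counts)
    picked = rng.choice(pool, size=depth, replace=False)
    # Tally the subsampled sequences back into per-OTU counts.
    return np.bincount(picked, minlength=counts.size)


rng = np.random.default_rng(0)

# Hypothetical collapsed sources: "gut" is the sum of three samples,
# "soil" comes from a single sample.
gut_sum = np.array([60, 30])
soil_sum = np.array([5, 15])

depth = 20
gut_rarefied = rarefy(gut_sum, depth, rng)
soil_rarefied = rarefy(soil_sum, depth, rng)

# After rarefaction both sources contribute the same number of sequences.
print(gut_rarefied.sum(), soil_rarefied.sum())
```

Under this assumption the number of samples behind a source no longer changes its total mass, although summing still weights deeply sequenced samples more heavily within a source than taking the mean would.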
