Open gregcaporaso opened 8 years ago
+1 for more collapse methods.
I don't think the current version of the code overestimates any sources, because the default is to rarefy the collapsed source matrix. Thus, even when N samples are collapsed into a source, its sequence/probability mass contribution will be the same as that of a source built from 1 sample.
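To illustrate the point about rarefaction, here is a minimal sketch (not ST2's actual code) of subsampling collapsed source rows to a common depth. The function name `rarefy`, the example counts, and the depth of 30 are all assumptions made up for illustration.

```python
import random
from collections import Counter


def rarefy(row, depth, seed=None):
    """Subsample a row of OTU counts, without replacement, to `depth` sequences.

    Hypothetical helper for illustration; not part of SourceTracker2.
    """
    rng = random.Random(seed)
    # Expand counts into one entry per sequence, labelled by OTU index.
    pool = [otu for otu, count in enumerate(row) for _ in range(count)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences in the row")
    picked = Counter(rng.sample(pool, depth))
    return [picked.get(otu, 0) for otu in range(len(row))]


# Made-up collapsed sources: "gut" summed over 3 samples, "soil" from 1 sample.
gut = [60, 0, 15]
soil = [0, 40, 5]

rarefied_gut = rarefy(gut, depth=30, seed=0)
rarefied_soil = rarefy(soil, depth=30, seed=0)

# After rarefaction both sources carry the same total number of sequences,
# regardless of how many samples were collapsed into each.
print(sum(rarefied_gut), sum(rarefied_soil))
```

Under these assumptions, both sources end up with the same total count, so a source with more samples does not carry more mass just because its counts were summed.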
On Wed, Mar 30, 2016 at 3:49 PM, Greg Caporaso notifications@github.com wrote:
From @lkursell https://github.com/lkursell on March 24, 2016 17:47
Currently, ST2 adds together sample OTU counts for samples that belong to the same source. This can cause overestimation of contributions from Source environments that have more samples.
One solution would be to be able to pass a --collapse_sources and --collapse_sinks flag with a --collapse_method such as mean or sum. Currently this can be done with the collapse_samples.py script, although that necessitates changes in mapping files and sample names.
_Copied from original issue: biota/sourcetracker2_internal#28 (https://github.com/biota/sourcetracker2_internal/issues/28)_
From @lkursell on March 24, 2016 17:47
Currently, ST2 adds together sample OTU counts for samples that belong to the same source. This can cause overestimation of contributions from Source environments that have more samples.
One solution would be to be able to pass a `--collapse_sources` and `--collapse_sinks` flag with a `--collapse_method` such as mean or sum. Currently this can be done with the `collapse_samples.py` script, although that necessitates changes in mapping files and sample names.
_Copied from original issue: biota/sourcetracker2_internal#28_
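As a rough sketch of what the proposed collapse methods would do, the snippet below groups samples by source and collapses them by sum or mean. The function `collapse_sources`, the sample names, and the counts are hypothetical; this is not the ST2 or `collapse_samples.py` implementation.

```python
from collections import defaultdict


def collapse_sources(counts, sample_to_source, method="sum"):
    """Collapse per-sample OTU counts into one row per source.

    counts: dict mapping sample id -> list of OTU counts (same OTU order).
    sample_to_source: dict mapping sample id -> source name.
    method: "sum" or "mean" (hypothetical stand-ins for --collapse_method).
    """
    if method not in ("sum", "mean"):
        raise ValueError("method must be 'sum' or 'mean'")
    grouped = defaultdict(list)
    for sample_id, row in counts.items():
        grouped[sample_to_source[sample_id]].append(row)
    collapsed = {}
    for source, rows in grouped.items():
        # Column-wise totals across all samples of this source.
        totals = [sum(column) for column in zip(*rows)]
        if method == "mean":
            totals = [total / len(rows) for total in totals]
        collapsed[source] = totals
    return collapsed


# Made-up data: three gut samples and one soil sample.
counts = {
    "gut1": [10, 0, 5],
    "gut2": [20, 0, 5],
    "gut3": [30, 0, 5],
    "soil1": [0, 40, 5],
}
sample_to_source = {"gut1": "gut", "gut2": "gut", "gut3": "gut", "soil1": "soil"}

by_sum = collapse_sources(counts, sample_to_source, method="sum")
by_mean = collapse_sources(counts, sample_to_source, method="mean")

print(by_sum)   # gut totals grow with the number of samples
print(by_mean)  # gut is on the same scale as a single sample
```

With sum, the gut row totals 75 sequences against 45 for soil; with mean, the gut row totals 25. Whether that difference matters downstream depends on whether the collapsed matrix is rarefied afterwards.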