joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

multitest #439

Closed naarkhoo closed 9 years ago

naarkhoo commented 9 years ago

I wonder why multtest is used in phyloseq, since it seems the result is subject to the value of "B". What is the advantage of permutation? Why not simply use a test (F-test, Wilcoxon, ...) and correct it? Thank you.

spholmes commented 9 years ago

Correcting it is the point. The standard multiple hypothesis correction algorithms use B permutations; if you take B large enough, it will have no influence on the results. Please see the literature on multiple hypothesis testing, for instance here: http://link.springer.com/chapter/10.1007/0-387-29362-0_15#page-1 or here: Westfall, Peter H. Resampling-based multiple testing: Examples and methods for p-value adjustment. Vol. 279. John Wiley & Sons, 1993.
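
To illustrate why a large enough B stops mattering, here is a minimal base-R sketch of a Monte Carlo permutation p-value for a two-group difference in means. This is not multtest's algorithm; the data, the `perm_pvalue` helper, and the choice of statistic are all assumptions made for illustration. It repeats the estimate several times for each B and compares how much the estimates scatter.

```r
# Hypothetical abundances for one taxon in two groups of six samples each.
x <- c(0.1, 0.5, 0.9, 1.3, 1.7, 2.1,   # group "a"
       0.8, 1.2, 1.6, 2.0, 2.4, 2.8)   # group "b"
g <- rep(c("a", "b"), each = 6)

# Monte Carlo permutation p-value based on B random relabelings.
perm_pvalue <- function(x, g, B) {
  stat <- function(lab) abs(mean(x[lab == "a"]) - mean(x[lab == "b"]))
  observed <- stat(g)
  permuted <- replicate(B, stat(sample(g)))
  # Adding one to numerator and denominator keeps the estimate above zero.
  (sum(permuted >= observed) + 1) / (B + 1)
}

set.seed(1)
Bs <- c(100, 1000, 10000)
# For each B, estimate the p-value 20 times and record the standard deviation.
spread <- sapply(Bs, function(B) sd(replicate(20, perm_pvalue(x, g, B))))
names(spread) <- paste0("B=", Bs)
print(spread)
```

The scatter between repeated runs shrinks roughly like 1/sqrt(B), so with B in the thousands the reported p-value is effectively determined by the data rather than by the random draws.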

Susan Holmes Professor, Statistics and BioX Director, MCS Sequoia Hall, 390 Serra Mall Stanford, CA 94305 http://www-stat.stanford.edu/~susan/

joey711 commented 9 years ago

I agree with @spholmes. I would also like to add that multtest was included very early in phyloseq's development and is still included mostly for backward compatibility and for rare, special cases. I would even be amenable to deprecating it because of the confusion its presence in the package appears to be causing. Though it is a robust improvement over a standard F-test, we generally do not recommend multtest as a means of detecting differential abundance of species/taxa/OTUs in amplicon sequencing data. Instead, we suggest that investigators use a mature implementation of the Negative Binomial Wald test, like that provided through our interface to DESeq2.

The point here is that if you use a bad, biased, or sub-optimal test, p-value correction is not going to save you. It simply saves you from the additional Type I error introduced by doing many bad, biased, or sub-optimal tests.
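
As a small illustration of that point, the base-R sketch below applies `p.adjust` to a vector of made-up raw p-values (the numbers are hypothetical, not from any dataset). The correction only inflates the p-values it is given and never reorders them, so whatever ranking the underlying test produced, good or bad, is carried through unchanged.

```r
# Hypothetical raw p-values from five per-taxon tests.
raw_p <- c(0.001, 0.02, 0.03, 0.2, 0.6)

# Two common corrections available in base R.
adj_bh   <- p.adjust(raw_p, method = "BH")          # Benjamini-Hochberg (FDR)
adj_bonf <- p.adjust(raw_p, method = "bonferroni")  # Bonferroni (FWER)

print(data.frame(raw_p, adj_bh, adj_bonf))

# Adjusted values are never smaller than the raw ones, and sorting by the
# raw p-values leaves the adjusted values in non-decreasing order too.
stopifnot(all(adj_bh >= raw_p), all(adj_bonf >= raw_p))
stopifnot(!is.unsorted(adj_bh[order(raw_p)]))
```

If the test statistic itself is poorly suited to the data, the taxa at the top of this list are the wrong ones before and after adjustment.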

See our explanation in PLoS Computational Biology for further details:

http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003531

I will close this issue for now. Thanks for your feedback and interest in phyloseq!

joey