compare_alpha_diversity.py - wrong DF used

gregcaporaso commented 12 years ago

compare_alpha_diversity.py uses the wrong number of degrees of freedom when it performs the t-tests. This results in p-values that are much higher than they should be.

In qiime.compare_alpha_diversity.compare_alpha_diversities, a matrix is built where each row corresponds to a rarefaction iteration and each column corresponds to the alpha diversity score for a sample. Two of these matrices are built for each pair of sample groups (e.g. 'Fast' vs. 'Control') and then they are passed to the cogent.maths.stats.test.t_two_sample function.

t_two_sample does not work correctly when given a matrix (or array, nested list, etc.) because it counts the observations in each group by calling len(). On a matrix, len() returns only the number of rows, not the total number of elements. These counts are used to calculate the degrees of freedom of the t distribution, so the resulting t test statistic and p-value will be wrong.
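To illustrate the undercount, here is a minimal sketch in plain Python. It does not call PyCogent. The data, the group names, and the count_elements helper are hypothetical, and the degrees of freedom are computed with the textbook n1 + n2 - 2 formula for a two-sample t-test, which I'm assuming is what t_two_sample uses.

```python
# Hypothetical data: 3 rarefaction iterations (rows) x 4 samples (columns).
group_a = [
    [10.0, 12.0, 11.0, 13.0],
    [10.5, 11.5, 11.0, 12.5],
    [9.5, 12.5, 10.5, 13.5],
]
group_b = [
    [8.0, 9.0, 8.5, 9.5],
    [8.5, 9.5, 9.0, 10.0],
    [7.5, 9.0, 8.0, 9.5],
]


def count_elements(matrix):
    """Total number of scores in a nested list (hypothetical helper)."""
    return sum(len(row) for row in matrix)


# len() on a nested list only counts the rows.
rows_a, rows_b = len(group_a), len(group_b)

# The number of observations we actually want is the element count.
n_a, n_b = count_elements(group_a), count_elements(group_b)

# Degrees of freedom for a two-sample t-test: n1 + n2 - 2.
df_from_len = rows_a + rows_b - 2
df_from_elements = n_a + n_b - 2

print(df_from_len, df_from_elements)
```

With this toy input, the row-based count gives 4 degrees of freedom while the element-based count gives 22.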

I think the easiest way to fix this is to flatten each matrix before passing it to t_two_sample.
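A rough sketch of the flattening idea, using only the standard library. The pooled_t function below is a textbook pooled-variance t statistic standing in for t_two_sample, not the PyCogent implementation, and flatten, pooled_t, and the sample data are all hypothetical names and values.

```python
from itertools import chain
from math import sqrt
from statistics import mean, variance


def flatten(matrix):
    """Turn a nested list of rows into one flat list of scores."""
    return list(chain.from_iterable(matrix))


def pooled_t(xs, ys):
    """Textbook pooled-variance two-sample t statistic and its DF.

    Stand-in for t_two_sample; expects flat sequences of numbers.
    """
    n_x, n_y = len(xs), len(ys)
    df = n_x + n_y - 2
    pooled_var = ((n_x - 1) * variance(xs) + (n_y - 1) * variance(ys)) / df
    t = (mean(xs) - mean(ys)) / sqrt(pooled_var * (1.0 / n_x + 1.0 / n_y))
    return t, df


# Hypothetical matrices: 2 rarefaction iterations x 3 samples per group.
fast = [[10.0, 12.0, 11.0], [10.5, 11.5, 11.0]]
control = [[8.0, 9.0, 8.5], [8.5, 9.5, 9.0]]

# Flatten first so that len() inside the test sees every observation.
t_stat, df = pooled_t(flatten(fast), flatten(control))
print(t_stat, df)
```

Because the inputs are flat lists, len() now returns 6 for each group and the degrees of freedom come out as 10 rather than 2.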

gregcaporaso commented 12 years ago

Imported from trac issue 237. Created by jrideout on 2012-10-11T09:55:37, last modified: 2012-10-11T09:55:37

jairideout commented 12 years ago

This issue was fixed in the pull request that was merged.
