compare_alpha_diversity.py uses the wrong number of degrees of freedom when it performs the t-tests. This results in p-values that are much higher than they should be.
In qiime.compare_alpha_diversity.compare_alpha_diversities, a matrix is built where each row corresponds to a rarefaction iteration and each column corresponds to the alpha diversity score for a sample. Two of these matrices are built for each pair of sample groups (e.g. 'Fast' vs. 'Control') and then they are passed to the cogent.maths.stats.test.t_two_sample function.
t_two_sample does not work correctly when given a matrix (or array, nested list, etc.) because it calculates the number of observations (i.e. numbers) in each group by calling len(), which will not get you the total number of elements in the matrix you're passing in (you'll only get the number of rows). These values are used to calculate the degrees of freedom of the t distribution, so the resulting t test statistic and p-value will be wrong.
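To illustrate the len() problem, here is a minimal sketch using NumPy only. The matrix shape (10 rarefaction iterations by 5 samples) and the values are made up for illustration, and the degrees-of-freedom line assumes the usual pooled-variance formula n1 + n2 - 2 with two groups of equal size; it does not call t_two_sample itself.

```python
import numpy as np

# Hypothetical data: 10 rarefaction iterations (rows) x 5 samples (columns).
scores = np.arange(50, dtype=float).reshape(10, 5)

n_from_len = len(scores)   # counts rows only
n_total = scores.size      # counts every alpha diversity value

# Assumed pooled-variance degrees of freedom for two groups of this shape.
df_from_len = n_from_len + n_from_len - 2
df_total = n_total + n_total - 2

print(n_from_len, n_total)    # 10 50
print(df_from_len, df_total)  # 18 98
```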
I think the easiest way to fix this is to flatten each matrix before passing it to t_two_sample.
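A rough sketch of the flattening step, assuming NumPy is available. The helper name flatten_scores and the example values are hypothetical and not part of QIIME or PyCogent; the flattened arrays would then be passed to t_two_sample in place of the matrices.

```python
import numpy as np

def flatten_scores(matrix):
    """Return a 1-D array holding every value of a 2-D score matrix."""
    return np.asarray(matrix, dtype=float).ravel()

# Hypothetical matrices: rows are rarefaction iterations, columns are samples.
group_a = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]]
group_b = [[2.0, 3.0], [2.5, 3.5]]

flat_a = flatten_scores(group_a)
flat_b = flatten_scores(group_b)

# After flattening, len() matches the number of observations in each group.
print(len(flat_a), len(flat_b))  # 6 4
```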