aldex.ttest for paired testing

sbslee commented 3 years ago

Hello @ggloor,

This is just a quick follow-up to the recently closed issue #33 where you mentioned:

to run a paired t-test, the samples must be in order of pairing.

So A1,B1,A2,B2 etc or A1,A2,A3,... B1,B2,B3 ...

I'm just trying to see if I fully understand how to use the aldex.ttest function in the context of paired testing.

For this, let us use the example given for the aldex.ttest function in the manual as shown below.

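library(ALDEx2)
data(selex)
# subset for efficiency
selex <- selex[1201:1600,]
# the first 7 columns are "NS" samples, the last 7 are "S" samples
conds <- c(rep("NS", 7), rep("S", 7))
# generate Monte Carlo instances of the clr-transformed values
x <- aldex.clr(selex, conds, mc.samples=2, denom="all")
# unpaired tests (paired.test is not set)
ttest.test <- aldex.ttest(x)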

Note that this example does NOT use paired testing. But let's keep going...

Here, the selex dataframe contains 14 samples and 400 features. The samples have a condition of either "NS" or "S".

> dim(selex)
[1] 400  14
> head(selex)
        X1_ANS X1_BNS X1_CNS X1_DNS X2_ANS X2_CNS X2_DNS X1_AS X1_BS X1_CS X1_DS X2_AS X2_CS X2_DS
S:D:A:D    524    355    443    489    465    509    754     0     0     0    13   675     1     4
S:D:A:E    588    383    564    462    559    564    961     5     5    11   437    10     4     1
S:E:A:D    596    318    542    443    605    459   1022    77    44     8    12     4     2    89
S:E:A:E    535    352    549    514    555    465   1476   718   168    76   459    10    31     5
S:D:C:D    218    104    192    193    177    190    709     0     0     0     0     1     0     0
S:D:C:E    269    180    151    234    281    269    467     1     0     0     4     0     0     0
> print(conds)
 [1] "NS" "NS" "NS" "NS" "NS" "NS" "NS" "S"  "S"  "S"  "S"  "S"  "S"  "S"

Now, we can see which features were determined to be significantly differentially abundant by ALDEx2:

> head(ttest.test)
              we.ep     we.eBH        wi.ep     wi.eBH
S:D:A:D 0.651204830 0.87156196 0.7103729604 0.83246355
S:D:A:E 0.045328870 0.28907132 0.0134032634 0.07872605
S:E:A:D 0.000968208 0.02284082 0.0005827506 0.00971251
S:E:A:E 0.001772013 0.03488500 0.0005827506 0.00971251
S:D:C:D 0.198767277 0.53778034 0.3717948718 0.63634844
S:D:C:E 0.427278491 0.71021087 0.4396853147 0.66512436

So far so good. But now, imagine I want to perform paired testing with this dataset. For that, I can do the following:

> ttest.test.paired <- aldex.ttest(x, paired.test=TRUE)
> head(ttest.test.paired)
              we.ep     we.eBH    wi.ep    wi.eBH
S:D:A:D 0.650135703 0.86370857 0.843750 0.9629630
S:D:A:E 0.040323193 0.27514227 0.031250 0.2623663
S:E:A:D 0.001037489 0.02636482 0.015625 0.1895680
S:E:A:E 0.002243459 0.04623270 0.015625 0.1895680
S:D:C:D 0.176885684 0.50431573 0.296875 0.6520539
S:D:C:E 0.413937029 0.69620555 0.453125 0.7526981

Basically, the call above compares the samples in the following way, matching the i-th "NS" column with the i-th "S" column:

"NS" group vs. "S" group
      X1_ANS vs. X1_AS
      X1_BNS vs. X1_BS
      X1_CNS vs. X1_CS
      X1_DNS vs. X1_DS 
      X2_ANS vs. X2_AS
      X2_CNS vs. X2_CS
      X2_DNS vs. X2_DS
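
To make the pairing-by-position idea concrete, here is a minimal base R sketch. It does not call ALDEx2; it uses the S:E:A:E counts from the head(selex) output above with an illustrative log2 transform and a 0.5 pseudocount (both are assumptions for this sketch, not the clr values that aldex.clr produces). It shows that a paired test matches the i-th value of one group with the i-th value of the other, so it is equivalent to a one-sample test on the within-pair differences.

```r
# Counts for feature S:E:A:E, copied from the head(selex) output above
ns <- c(X1_ANS = 535, X1_BNS = 352, X1_CNS = 549, X1_DNS = 514,
        X2_ANS = 555, X2_CNS = 465, X2_DNS = 1476)
s  <- c(X1_AS = 718, X1_BS = 168, X1_CS = 76, X1_DS = 459,
        X2_AS = 10, X2_CS = 31, X2_DS = 5)

# Illustrative transform only (assumption): log2 with a 0.5 pseudocount
log.ns <- log2(ns + 0.5)
log.s  <- log2(s + 0.5)

# Paired tests match element i of the first vector with element i of the second
t.paired <- t.test(log.ns, log.s, paired = TRUE)
w.paired <- wilcox.test(log.ns, log.s, paired = TRUE)

# The paired t-test is a one-sample t-test on the within-pair differences
t.diff <- t.test(log.ns - log.s, mu = 0)
stopifnot(isTRUE(all.equal(t.paired$p.value, t.diff$p.value)))

# Reversing one group changes which samples are paired
t.reordered <- t.test(log.ns, rev(log.s), paired = TRUE)
print(c(paired = t.paired$p.value, reordered = t.reordered$p.value))
```

Because the pairs are defined only by position, a column order that does not match the true pairing would silently test the wrong pairs.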

And now my questions:

Q1. Could you kindly confirm that my understanding of the aldex.ttest function for paired testing (as described above) is correct?

Q2. Could you confirm that aldex.ttest(x, paired.test=TRUE) will output the results of a paired Welch's t-test (i.e. we.ep) and a Wilcoxon signed-rank test (i.e. wi.ep)?

Thank you very much for reading this. I'm very excited to try out the aldex.ttest function for my own dataset (microbiome data from normal tissue vs. tumor tissue from cancer patients).
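
For a normal vs. tumor design like this, one way to guarantee that the columns are in pairing order is to sort the sample sheet before building the conditions vector. The sketch below is base R only. The sample sheet, its column names (sample, patient, tissue) and the counts table mentioned in the last comment are all hypothetical and used purely for illustration.

```r
# Hypothetical sample sheet: one row per sample, in arbitrary order
meta <- data.frame(
  sample  = c("s1", "s2", "s3", "s4", "s5", "s6"),
  patient = c("P2", "P1", "P3", "P1", "P3", "P2"),
  tissue  = c("tumor", "normal", "normal", "tumor", "tumor", "normal"),
  stringsAsFactors = FALSE
)

# Sort by tissue first, then by patient, so both groups list patients in the same order
meta <- meta[order(meta$tissue, meta$patient), ]
conds <- meta$tissue

# Check that the i-th normal sample and the i-th tumor sample share a patient
normal.patients <- meta$patient[meta$tissue == "normal"]
tumor.patients  <- meta$patient[meta$tissue == "tumor"]
stopifnot(identical(normal.patients, tumor.patients))

# The count table columns would then be reordered to match, e.g.:
# counts <- counts[, meta$sample]   # 'counts' is a hypothetical features-by-samples table
print(meta)
```

The stopifnot check fails loudly if a patient is missing a sample from either tissue, which is exactly the case where position-based pairing would go wrong.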

ggloor commented 3 years ago

aldex.ttest works as you described above. It uses the order of the samples to set up the pairing relationships.

aldex.ttest returns the expected value, across the MC samples, of the corresponding tests in multtest::mt.teststat. So the tests return what you indicated, except that the reported values are averages over the Monte Carlo instances.
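
As a rough illustration of what "expected value across the MC samples" means, the base R sketch below runs one paired test per Monte Carlo instance and then averages the p-values. It is not the ALDEx2 implementation: the values are simulated with rnorm as stand-ins for clr values, and the names n.mc, group.a and group.b are hypothetical.

```r
set.seed(1)      # make the simulated values reproducible
n.pairs <- 7     # pairs per group, as in the selex example
n.mc    <- 16    # hypothetical number of Monte Carlo instances

# Simulated stand-ins for one feature: rows are pairs, columns are MC instances
group.a <- matrix(rnorm(n.pairs * n.mc, mean = 0), nrow = n.pairs)
group.b <- matrix(rnorm(n.pairs * n.mc, mean = 1), nrow = n.pairs)

# One paired t-test and one Wilcoxon signed-rank test per instance
t.p <- sapply(seq_len(n.mc), function(i)
  t.test(group.a[, i], group.b[, i], paired = TRUE)$p.value)
w.p <- sapply(seq_len(n.mc), function(i)
  wilcox.test(group.a[, i], group.b[, i], paired = TRUE)$p.value)

# Averaging over instances gives one expected p-value per test,
# analogous in spirit to the we.ep and wi.ep columns
expected <- c(t.ep = mean(t.p), w.ep = mean(w.p))
print(expected)
```

With mc.samples=2, as in the manual example, each reported value would be an average over only two instances, so larger values of mc.samples give more stable estimates.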

sbslee commented 3 years ago

@ggloor, thank you for answering my questions! It totally makes sense. Closing this issue.

ggloor / ALDEx_bioc

aldex.ttest for paired testing #35