
p-adj values in pairwise_tukeyhsd and MultiComparison bounded by 0.001 and 0.9 #8185

Open TalWac opened 2 years ago

TalWac commented 2 years ago

Describe the bug

Dear developer,

When I run pairwise_tukeyhsd or MultiComparison.tukeyhsd, I see that the values in p-adj are bounded from below by 0.001 and from above by 0.9. When I compare to R's TukeyHSD, the p-adj values can be higher than 0.9 and smaller than 0.001.

This is a problem for me since I have many other columns M# (i.e., M1, M2, ..., M900) and I want to adjust these values with FDR (false discovery rate) afterward. As it stands, values that are much smaller than 0.001 are rounded up to 0.001 and will not be significant after FDR correction.

Code sample (Python):


import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# the data
dat = {'Class_0': ['EKVX', 'EKVX', 'EKVX', 'EKVX', 'EKVX', 'EKVX', 'HOP62', 'HOP62',
                   'HOP62', 'HOP62', 'HOP62', 'HOP62', 'HOP92', 'HOP92', 'HOP92',
                   'HOP92', 'HOP92', 'HOP92', 'MPLT4', 'MPLT4', 'MPLT4', 'MPLT4',
                   'MPLT4', 'MPLT4', 'RPMI8226', 'RPMI8226', 'RPMI8226', 'RPMI8226',
                   'RPMI8226', 'RPMI8226'],
        'M2': [29.0660572, 29.4701607, 29.92649292, 29.62876917, 31.09229301,
              30.73131457, 29.54510531, 30.37553099, 29.22060366, 30.00387611,
              30.2893471, 30.42107896, 30.36357138, 30.04577991, 30.41139568,
              30.18903135, 30.23977308, 30.38604485, 28.45853761, 28.34003309,
              28.49059352, 28.65866869, 28.73640364, 28.76752952, 26.45224629,
              26.17478715, 26.77912804, 26.55138747, 26.69202968, 26.58586463]}

dat = pd.DataFrame(data=dat)

tukey_out = pairwise_tukeyhsd(endog=dat['M2'], groups=dat['Class_0'], alpha=0.05)
print(tukey_out)
 Multiple Comparison of Means - Tukey HSD, FWER=0.05  
======================================================
group1  group2  meandiff p-adj   lower   upper  reject
------------------------------------------------------
  EKVX    HOP62  -0.0099    0.9 -0.7456  0.7257  False
  EKVX    HOP92   0.2868 0.7575 -0.4489  1.0224  False
  EKVX    MPLT4  -1.4106  0.001 -2.1462 -0.6749   True
  EKVX RPMI8226  -3.4466  0.001 -4.1823 -2.7109   True
 HOP62    HOP92   0.2967 0.7359  -0.439  1.0323  False
 HOP62    MPLT4  -1.4006  0.001 -2.1363  -0.665   True
 HOP62 RPMI8226  -3.4367  0.001 -4.1723  -2.701   True
 HOP92    MPLT4  -1.6973  0.001  -2.433 -0.9616   True
 HOP92 RPMI8226  -3.7334  0.001  -4.469 -2.9977   True
 MPLT4 RPMI8226  -2.0361  0.001 -2.7717 -1.3004   True
------------------------------------------------------

Code sample (R) for the same data:

aov.out <- lm(M2 ~ Class_0, data = dat)
TukeyHSD.out <- TukeyHSD(aov(aov.out))
> TukeyHSD.out
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = aov.out)

$Class_0
                       diff        lwr        upr     p adj
HOP62-EKVX     -0.009924239 -0.7455454  0.7256969 0.9999994
HOP92-EKVX      0.286751449 -0.4488697  1.0223726 0.7814832
MPLT4-EKVX     -1.410553581 -2.1461747 -0.6749324 0.0000670
RPMI8226-EKVX  -3.446607382 -4.1822285 -2.7109862 0.0000000
HOP92-HOP62     0.296675687 -0.4389455  1.0322968 0.7599709
MPLT4-HOP62    -1.400629342 -2.1362505 -0.6650082 0.0000741
RPMI8226-HOP62 -3.436683143 -4.1723043 -2.7010620 0.0000000
MPLT4-HOP92    -1.697305029 -2.4329262 -0.9616839 0.0000039
RPMI8226-HOP92 -3.733358831 -4.4689800 -2.9977377 0.0000000
RPMI8226-MPLT4 -2.036053801 -2.7716750 -1.3004326 0.0000002

I have run these for the different columns 'M#' (i.e., M1, ..., M900).

The adjusted p-values differ somewhat between R and Python but are reasonably close; however, in statsmodels they are always bounded by 0.001 and 0.9.

Is it possible to change this?

Kindly help,

Tal

josef-pkt commented 2 years ago

AFAIK, that's a limitation of how our p-values are implemented in qsturng. The latest version of scipy has a better implementation of the Tukey HSD distribution. This might improve if you have both scipy and statsmodels at their latest versions, but I have not tried it yet.

Correction: the change to optionally delegate p-values to scipy is in main but not yet in a release, AFAICS (pull request #8035).
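For reference, the exact (unrounded) Tukey HSD p-values can also be computed directly from scipy's studentized range distribution. A minimal sketch, assuming equal group sizes, scipy >= 1.7, and the dat frame from the example above:

import numpy as np
from itertools import combinations
from scipy.stats import studentized_range

# per-group summaries for the balanced one-way layout above
groups = dat.groupby('Class_0')['M2']
means = groups.mean()
k = means.size                     # number of groups
n = int(groups.size().iloc[0])     # per-group sample size (equal n assumed)
df = dat.shape[0] - k              # error degrees of freedom
mse = groups.var(ddof=1).mean()    # pooled within-group variance (equal n)

for a, b in combinations(means.index, 2):
    q = abs(means[a] - means[b]) / np.sqrt(mse / n)   # studentized range statistic
    p = studentized_range.sf(q, k, df)                # unrounded Tukey HSD p-value
    print(f'{a} vs {b}: p-adj = {p:.7f}')

This is essentially the same tail probability that R's ptukey computes here, so the values should line up with the R TukeyHSD output above rather than being truncated at 0.001/0.9.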

However:

I want to adjust these values with FDR (false discovery rate) afterward. As it stands, values that are much smaller than 0.001 are rounded up to 0.001 and will not be significant after FDR correction.

Tukey HSD already corrects for multiple testing (FWER). Why are you still using FDR p-value correction after that?

TalWac commented 2 years ago

Tukey HSD already corrects for multiple testing (FWER). Why are you still using FDR p-value correction after that?

You are right that it corrects for multiple testing, but only within a specific column; in my example above it is called M2 (molecule #2). Since I have ~900 columns (molecules M1, M2, M3, ..., M900), I need to correct for the number of tests I do, not only the number of pairwise comparisons within one column.
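To illustrate that second stage, a rough sketch of what I mean, assuming the result object exposes the p-adj column as pvalues (as in recent statsmodels versions) and using statsmodels' multipletests for the Benjamini-Hochberg step; the column loop is only illustrative:

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multitest import multipletests

# illustrative loop: one Tukey HSD per molecule column (M1, ..., M900),
# pooling every pairwise p-value before the across-molecule FDR step
molecule_cols = [c for c in dat.columns if c.startswith('M')]
all_pvals = []
for col in molecule_cols:
    res = pairwise_tukeyhsd(endog=dat[col], groups=dat['Class_0'], alpha=0.05)
    all_pvals.extend(np.asarray(res.pvalues))

# Benjamini-Hochberg FDR across all (molecule, pair) tests
reject, pvals_corrected, _, _ = multipletests(all_pvals, alpha=0.05, method='fdr_bh')

With ~900 molecules times 10 pairs, p-values truncated at 0.001 can easily lose significance in this step, which is the problem described above.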

As for https://github.com/statsmodels/statsmodels/pull/8035, I am not sure I understood how to combine scipy and statsmodels to get the correct p-values.

josef-pkt commented 2 years ago

how to combine scipy and statsmodels to get the correct p-values

You need to have the latest scipy release and the development version of statsmodels.

I found nightly versions, but have never tried those out: https://anaconda.org/scipy-wheels-nightly/statsmodels

Otherwise, statsmodels main needs to be installed from GitHub.

I'm not yet on the latest scipy, but scipy.stats now also has tukey_hsd, which you should be able to use: https://docs.scipy.org/doc/scipy-1.8.0/html-scipyorg/reference/generated/scipy.stats.tukey_hsd.html

TalWac commented 2 years ago

As for :

I found nightly versions, but have never tried those out: https://anaconda.org/scipy-wheels-nightly/statsmodels Otherwise, statsmodels main needs to be installed from GitHub.

It was a bit confusing and I did not try it.

But the last option:

I'm not yet on the latest scipy, but scipy.stats now also has tukey_hsd, which you should be able to use: https://docs.scipy.org/doc/scipy-1.8.0/html-scipyorg/reference/generated/scipy.stats.tukey_hsd.html

I used the link above, and the p-values are much more accurate and more similar to those in the R package.

Thank you very much!!

TalWac commented 2 years ago

However, the lower and upper confidence-interval values from tukey_hsd in scipy.stats have the opposite signs compared to R and pairwise_tukeyhsd in statsmodels.stats.multicomp.

from scipy.stats import tukey_hsd

# split the M2 measurements by cell line
dat_EKVX = dat.loc[dat['Class_0'] == 'EKVX', 'M2']
dat_HOP62 = dat.loc[dat['Class_0'] == 'HOP62', 'M2']
dat_HOP92 = dat.loc[dat['Class_0'] == 'HOP92', 'M2']
dat_MPLT4 = dat.loc[dat['Class_0'] == 'MPLT4', 'M2']
dat_RPMI8226 = dat.loc[dat['Class_0'] == 'RPMI8226', 'M2']

res = tukey_hsd(dat_EKVX, dat_HOP62, dat_HOP92, dat_MPLT4, dat_RPMI8226)

print(res)
Tukey's HSD Pairwise Group Comparisons (95.0% Confidence Interval)
Comparison  Statistic  p-value  Lower CI  Upper CI
 (0 - 1)      0.010     1.000    -0.726     0.746
 (0 - 2)     -0.287     0.781    -1.022     0.449
 (0 - 3)      1.411     0.000     0.675     2.146
 (0 - 4)      3.447     0.000     2.711     4.182
 (1 - 0)     -0.010     1.000    -0.746     0.726
 (1 - 2)     -0.297     0.760    -1.032     0.439
 (1 - 3)      1.401     0.000     0.665     2.136
 (1 - 4)      3.437     0.000     2.701     4.172
 (2 - 0)      0.287     0.781    -0.449     1.022
 (2 - 1)      0.297     0.760    -0.439     1.032
 (2 - 3)      1.697     0.000     0.962     2.433
 (2 - 4)      3.733     0.000     2.998     4.469
 (3 - 0)     -1.411     0.000    -2.146    -0.675
 (3 - 1)     -1.401     0.000    -2.136    -0.665
 (3 - 2)     -1.697     0.000    -2.433    -0.962
 (3 - 4)      2.036     0.000     1.300     2.772
 (4 - 0)     -3.447     0.000    -4.182    -2.711
 (4 - 1)     -3.437     0.000    -4.172    -2.701
 (4 - 2)     -3.733     0.000    -4.469    -2.998
 (4 - 3)     -2.036     0.000    -2.772    -1.300
josef-pkt commented 2 years ago

What do you mean by opposite values?

Pair differences can go either way, y0 - y1 or y1 - y0. Your scipy results include both directions instead of just the lower or upper triangle of all pairs, so each comparison appears twice with flipped signs and mirrored confidence bounds.
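A small sketch of how the scipy result could be reduced to one row per pair, using the statistic, pvalue, and confidence_interval() pieces of the scipy 1.8 result object; the group names here just follow the order in which the samples were passed to tukey_hsd above:

from itertools import combinations

# order must match the positional arguments passed to tukey_hsd above
names = ['EKVX', 'HOP62', 'HOP92', 'MPLT4', 'RPMI8226']
ci = res.confidence_interval(confidence_level=0.95)

# keep only the upper triangle (i < j): one row per pair, like the statsmodels/R tables
for i, j in combinations(range(len(names)), 2):
    print(f'{names[i]:>8} - {names[j]:<8} '
          f'diff={res.statistic[i, j]: .4f}  '
          f'p={res.pvalue[i, j]:.4f}  '
          f'CI=({ci.low[i, j]: .4f}, {ci.high[i, j]: .4f})')

Note that scipy reports mean_i - mean_j for pair (i, j), while the statsmodels and R tables above report group2 - group1 for the same pair, so the signs and interval bounds are mirrored, not wrong.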