statsmodels / statsmodels

Statsmodels: statistical modeling and econometrics in Python
http://www.statsmodels.org/devel/
BSD 3-Clause "New" or "Revised" License

Can you please add LSD-t, SNK-q and Dunnett-t test functions to the MultiComparison class? #8168

Open shixian77 opened 2 years ago

shixian77 commented 2 years ago

Or just give me some suggestions on how to implement it. Thanks a lot.

josef-pkt commented 2 years ago

I didn't plan to add LSD because, AFAIR, it doesn't control the FWER or apply any multiplicity correction, and it's better to use other methods. I don't remember anything about SNK or its properties.

Dunnett multiple comparison against a control is on the wishlist, but it is waiting for p-value tables or approximating functions.

852

shixian77 commented 2 years ago

I get it, thanks for the reply.

josef-pkt commented 2 years ago

https://www.graphpad.com/guides/prism/latest/statistics/stat_fishers_lsd.htm quote "While the protected Fisher's LSD test is of historical interest as the first multiple comparisons test ever developed, it is no longer recommended. It pretends to correct for multiple comparisons, but doesn't do so very well."

https://pubmed.ncbi.nlm.nih.gov/17128424/ it only controls the error rate if the number of groups is exactly 3, and the power of protected LSD doesn't look good either.

So LSD stays out of statsmodels.

aside: https://cran.r-project.org/web/packages/agricolae/agricolae.pdf R package that has LSD test with standard p-value correction

i.e. LSD is just a variant of a t-test and we correct p-values for multiplicity as in standard pairwise t-tests

also from the above graphpad docs

quote: "The only difference between a set of t tests and the Fisher's LSD test, is that t tests compute the pooled SD from only the two groups being compared, while the Fisher's LSD test computes the pooled SD from all the groups (which gains power but depends on the assumption that all groups are sampled from populations with the same SD)."

this sounds like it's just a difference in whether variance estimate uses pooling across pairs or not. (I looked at this briefly a long time ago, because our sandbox MultiComparison.allpairtest does not allow pooling over pairs.)
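To make the pooling difference concrete, here is a minimal sketch using only the Python standard library. It is not statsmodels code: the helper name `pairwise_t` and the data are made up for illustration. It computes the t statistic and degrees of freedom for every pair, once with the variance pooled over all groups (LSD-style) and once pooled over only the two groups being compared.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pairwise_t(groups, pool_all=True):
    """Return {(a, b): (t_statistic, df)} for every pair of groups.

    pool_all=True  -> LSD-style: variance pooled over all groups.
    pool_all=False -> ordinary two-sample t test: variance pooled
                      over the two groups in the pair only.
    Hypothetical helper, not part of statsmodels.
    """
    def pooled(names):
        # Pooled variance = sum of within-group sums of squares / total df.
        ss = sum((len(groups[g]) - 1) * variance(groups[g]) for g in names)
        df = sum(len(groups[g]) - 1 for g in names)
        return ss / df, df

    out = {}
    for a, b in combinations(groups, 2):
        var, df = pooled(groups) if pool_all else pooled((a, b))
        se = sqrt(var * (1 / len(groups[a]) + 1 / len(groups[b])))
        out[(a, b)] = ((mean(groups[a]) - mean(groups[b])) / se, df)
    return out


# Made-up example data, three groups of four observations each.
data = {
    "ctrl": [4.1, 5.0, 4.6, 5.3],
    "trtA": [5.9, 6.4, 6.1, 7.0],
    "trtB": [4.8, 5.2, 6.6, 5.8],
}
lsd = pairwise_t(data, pool_all=True)     # df = N - k for every pair
plain = pairwise_t(data, pool_all=False)  # df = n_a + n_b - 2 per pair
```

Two-sided p-values would then come from the t distribution with the returned degrees of freedom, and can be corrected for multiplicity like any other set of pairwise t-test p-values.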

josef-pkt commented 2 years ago

SNK sounds ok, it's a sequential Tukey range test. However, the GraphPad docs also recommend against using it: https://www.graphpad.com/support/faq/why-we-recommend-you-do-not-use-the-newman-keuls-multiple-comparison-test/

It looks like using Holm-Sidak is better, HS protects FWER and has more power than SNK
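For reference, a minimal pure-Python sketch of the Holm-Sidak step-down adjustment. The function name `holm_sidak` and the raw p-values are made up for illustration. In practice, `statsmodels.stats.multitest.multipletests(pvals, method='holm-sidak')` provides this correction.

```python
def holm_sidak(pvals):
    """Return Holm-Sidak adjusted p-values in the original order.

    Step-down Sidak: the i-th smallest p-value (i = 0, 1, ...) is
    adjusted to 1 - (1 - p)**(m - i), then a running maximum keeps
    the adjusted values monotone in the sorted order.
    Illustrative helper, not the statsmodels implementation.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = 1.0 - (1.0 - pvals[i]) ** (m - rank)
        running_max = max(running_max, adj)
        adjusted[i] = min(running_max, 1.0)
    return adjusted


# Made-up raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adj = holm_sidak(raw)
# A comparison is rejected at level alpha if its adjusted p-value <= alpha.
reject = [p <= 0.05 for p in adj]
```

Each adjusted p-value is at least as large as the raw one, so the correction can only remove rejections, never add them.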

shixian77 commented 2 years ago

I know they are insufficient, but I just need to implement them because my teacher requires it. Anyway, thanks for your time!