mayoverse / arsenal

An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
https://mayoverse.github.io/arsenal/

Tableby: Perform weighted versions of statistical tests when `weights` argument is used #223

Open scheidec opened 5 years ago

scheidec commented 5 years ago

Are there any current/future plans to incorporate weighted versions of the statistical tests into this function, without having to use modpval.tableby()? Any thought to using functions from the survey package such as svyttest, svychisq, and svyranktest to add this functionality if the weights argument is specified?

eheinzen commented 5 years ago

An excellent idea! I'm going to be releasing v3.2.0 in the next few days, but I'll try to incorporate this into the release after that!

scheidec commented 5 years ago

Awesome, I'd be glad to contribute in any way I can!

scheidec commented 5 years ago

Here's some starter code that could return the same information that is computed in the tableby.stat.tests.R functions:

# Weighted ANOVA/t-test

  if (!is.null(weights)) {

    test <- regTermTest(svyglm(x ~ x.by, design = svydesign(ids=~1, weights=~weights, data = data)),
                        test.terms = ~ x.by,  # one-sided formula naming the term to test
                        method = "Wald")

    return(list(p.value     = as.numeric(test["p"]),
                statistic.F = as.numeric(test["Ftest"]),
                method      = "Weighted Linear Model ANOVA"))
  }

# Weighted Kruskal-Wallis/Wilcoxon Test

  if (!is.null(weights)) {

    test <- svyranktest(x ~ x.by, design = svydesign(ids=~1, weights=~weights, data = data))

    return(list(p.value     = as.numeric(test["p.value"]),
                statistic.F = as.numeric(test["statistic"]),  # "parameter" holds the degrees of freedom, not the test statistic
                method      = "Weighted Kruskal-Wallis rank sum test"))
  }

# Weighted Chi-Square Test

  if (!is.null(weights)) {

    test <- svychisq(~ x.by + x, design = svydesign(ids=~1, weights=~weights, data = data))

    return(list(p.value     = as.numeric(test["p.value"]),
                statistic.F = as.numeric(test["statistic"]),
                method      = "Weighted Pearson's Chi-squared test (Rao & Scott adjustment)"))
  }

Would this make sense integrated into the existing functions in that script, in different functions or an entirely new tableby.stat.tests.weighted.R script?

I'm not aware of weighted versions of Fisher's exact test, the trend test for ordinal data, or the log-rank test for survival data, so maybe this would apply only to the tests above, and all others would return output similar to the "notest" option.
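
For anyone who wants to try the three fragments above outside of tableby, here is a rough self-contained sketch. It assumes the survey package is installed. The wrapper name `weighted_tests()`, the weight column `w`, the categorical column `g`, and the simulated data are all made up for illustration and are not part of arsenal.

```r
# Sketch only: weighted_tests(), the column names, and the simulated data
# are hypothetical; only the survey functions come from the thread above.
library(survey)

weighted_tests <- function(data, weights) {
  data$w <- weights
  # Independent sampling with per-row weights, as in the starter code
  des <- svydesign(ids = ~1, weights = ~w, data = data)

  # Weighted ANOVA / t-test: Wald test of the grouping term in a weighted linear model
  fit  <- svyglm(x ~ x.by, design = des)
  wald <- regTermTest(fit, ~ x.by, method = "Wald")

  # Weighted Kruskal-Wallis / Wilcoxon: design-based rank test
  rk <- svyranktest(x ~ x.by, design = des)

  # Weighted chi-square test with the Rao-Scott correction (the svychisq default)
  chi <- svychisq(~ x.by + g, design = des)

  list(
    anova = list(p.value = as.numeric(wald$p),
                 statistic.F = as.numeric(wald$Ftest),
                 method = "Weighted Linear Model ANOVA"),
    kwt   = list(p.value = as.numeric(rk$p.value),
                 statistic = as.numeric(rk$statistic),
                 method = "Weighted Kruskal-Wallis rank sum test"),
    chisq = list(p.value = as.numeric(chi$p.value),
                 statistic = as.numeric(chi$statistic),
                 method = "Weighted Pearson's Chi-squared test (Rao & Scott adjustment)")
  )
}

# Simulated example data: a three-level grouping factor, a numeric outcome,
# a second categorical variable, and arbitrary positive weights
set.seed(1)
n <- 200
dat <- data.frame(
  x.by = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  x    = rnorm(n),
  g    = factor(sample(c("yes", "no"), n, replace = TRUE))
)
w <- runif(n, 0.5, 2)

res <- weighted_tests(dat, w)
str(res)
```

Building the design object once and reusing it for all three tests avoids calling `svydesign()` repeatedly. Whether that fits tableby's internals is a separate question.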

bethatkinson commented 5 years ago

One option for the logrank is to use the weighted Cox model as an approximation to a true logrank test. Usually the results are quite close.
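
A rough sketch of that idea, assuming the survival package is installed: fit a Cox model with case weights and read off its score test, which plays the role of the log-rank test. The data and column names below are simulated and purely illustrative, and the sketch does not address which variance estimate is appropriate for survey weights.

```r
# Sketch only: the simulated data and column names are hypothetical.
library(survival)

set.seed(2)
n <- 200
dat <- data.frame(
  time   = rexp(n),                       # continuous times, so no ties
  status = rbinom(n, 1, 0.7),             # 1 = event, 0 = censored
  grp    = factor(sample(c("A", "B"), n, replace = TRUE)),
  w      = runif(n, 0.5, 2)               # arbitrary positive weights
)

# Unweighted reference: the Cox score test matches the log-rank test
lr      <- survdiff(Surv(time, status) ~ grp, data = dat)
fit0    <- coxph(Surv(time, status) ~ grp, data = dat)
score0  <- summary(fit0)$sctest
print(c(logrank = lr$chisq, cox_score = unname(score0["test"])))

# Weighted version: the score test from the weighted Cox model
fit_w   <- coxph(Surv(time, status) ~ grp, data = dat, weights = w)
score_w <- summary(fit_w)$sctest

weighted_logrank <- list(
  p.value   = unname(score_w["pvalue"]),
  statistic = unname(score_w["test"]),
  method    = "Weighted Cox model score test (log-rank approximation)"
)
str(weighted_logrank)
```

The unweighted comparison is there only as a sanity check that the score test and the log-rank statistic line up before weights are introduced.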



natachadata commented 4 years ago

Hello, I was reading your posts because I'm looking for a weighted version of Fisher's exact test. Do you know which R package can apply an exact test to a weighted dataset? Thanks for your help.

eheinzen commented 4 years ago

Sorry to take so long to get back to you about this. I'm asking around.

eheinzen commented 4 years ago

Update: we're not sure there is such a thing. Sorry about that!