smartdata-analysis-and-statistics / precmed

A doubly robust precision medicine approach to estimate and validate conditional average treatment effects
https://smartdata-analysis-and-statistics.github.io/precmed/
Apache License 2.0

Dictionary of arguments #14

Closed NightlordTW closed 2 years ago

NightlordTW commented 2 years ago

Please make a dictionary of the arguments used in the public functions (to make sure they are consistent). No need to include functions that are not intended to be used directly by external users (and are not exported in the NAMESPACE).
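One way such a dictionary could be assembled programmatically is to read each function's formal arguments with `formals()`. The sketch below is only an illustration: `pmcount_demo` and `pmsurv_demo` are hypothetical stand-ins with shortened signatures, not the package's functions. With the package installed, the same loop could presumably run over the names returned by `getNamespaceExports("precmed")`.

```r
# Hypothetical stand-ins for two exported functions (shortened signatures).
pmcount_demo <- function(cate.model, ps.model, data, score.method, seed = NULL) NULL
pmsurv_demo  <- function(cate.model, ps.model, score.method, data, seed = NULL) NULL
funs <- list(pmcount_demo = pmcount_demo, pmsurv_demo = pmsurv_demo)

# Long format: one row per (function, argument) pair with the argument position.
arg_dict <- do.call(rbind, lapply(names(funs), function(fn) {
  args <- names(formals(funs[[fn]]))
  data.frame(fun = fn, arg = args, position = seq_along(args),
             stringsAsFactors = FALSE)
}))

# Wide format: arguments in rows, functions in columns, cells hold positions.
arg_table <- with(arg_dict, tapply(position, list(arg, fun), FUN = identity))
print(arg_table)
```

Rows where the positions differ across columns point at arguments whose order is not consistent.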

jcaldasmagalhaes commented 2 years ago

####### export(abc)
abc <- function(x)

####### export(cv)
cv <- function(response, cate.model, ps.model, data, score.method, # Mandatory arguments (count & survival & continuous)
               init.model = NULL, # This is used in continuous data, when contrast/two regression is used
               ipcw.model = NULL, followup.time = NULL, tau0 = NULL, # Non-mandatory arguments survival only
               surv.min = 0.025, ipcw.method = "breslow",
               higher.y = TRUE, abc = TRUE, # Non-mandatory arguments survival & count & continuous (except xvar.smooth)
               prop.cutoff = seq(0.5, 1, length = 6), prop.multi = c(0, 1/3, 2/3, 1),
               ps.method = "glm", minPS = 0.01, maxPS = 0.99,
               train.prop = 3/4, cv.n = 10, error.max = 0.1, max.iter = 5000,
               initial.predictor.method = NULL, xvar.smooth.score = NULL, xvar.smooth.init = NULL,
               tree.depth = 2, n.trees.rf = 1000, n.trees.boosting = 200, B = 3, Kfold = 5,
               error.maxNR = 1e-3, max.iterNR = 150, tune = c(0.5, 2),
               seed = NULL, plot.gbmperf = TRUE, verbose = 2)

####### export(cvcount)
cvcount <- function(cate.model, ps.model, data, score.method,
                    higher.y = TRUE, abc = TRUE,
                    prop.cutoff = seq(0.5, 1, length = 6), prop.multi = c(0, 1/3, 2/3, 1),
                    ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                    train.prop = 3/4, cv.n = 10, error.max = 0.1, max.iter = 5000,
                    initial.predictor.method = "boosting", xvar.smooth = NULL,
                    tree.depth = 2, n.trees.boosting = 200, B = 3, Kfold = 5,
                    error.maxNR = 1e-3, max.iterNR = 150, tune = c(0.5, 2),
                    seed = NULL, plot.gbmperf = TRUE, verbose = 1, ...)

####### export(cvmean)
cvmean <- function(cate.model, init.model = NULL, ps.model, data, score.method,
                   higher.y = TRUE, abc = TRUE,
                   prop.cutoff = seq(0.5, 1, length = 6), prop.multi = c(0, 1/3, 2/3, 1),
                   ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                   train.prop = 3/4, cv.n = 10, error.max = 0.1, max.iter = 5000,
                   initial.predictor.method = "boosting", xvar.smooth.score = NULL, xvar.smooth.init = NULL,
                   tree.depth = 2, n.trees.rf = 1000, n.trees.boosting = 200, B = 3, Kfold = 6,
                   plot.gbmperf = TRUE, error.maxNR = 1e-3, tune = c(0.5, 2),
                   seed = NULL, verbose = 1, ...)

####### export(cvsurv)
cvsurv <- function(cate.model, ps.model, data, score.method,
                   ipcw.model = NULL, tau0 = NULL, followup.time = NULL,
                   surv.min = 0.025, ipcw.method = "breslow",
                   higher.y = TRUE, abc = TRUE,
                   prop.cutoff = seq(0.5, 1, length = 6), prop.multi = c(0, 1/3, 2/3, 1),
                   ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                   train.prop = 3/4, cv.n = 10, error.max = 0.1, max.iter = 5000,
                   initial.predictor.method = "randomForest",
                   tree.depth = 2, n.trees.rf = 1000, n.trees.boosting = 200, B = 3, Kfold = 5,
                   error.maxNR = 1e-3, max.iterNR = 150, tune = c(0.5, 2),
                   seed = NULL, plot.gbmperf = TRUE, verbose = 1)

####### export(dr.inference)
dr.inference <- function(response, cate.model, ps.model, data,
                         ipcw.model = NULL, followup.time = NULL, tau0 = NULL,
                         surv.min = 0.025, ipcw.method = "breslow",
                         ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                         interactions = TRUE, n.boot = 500,
                         seed = NULL, verbose = 1, plot.boot = FALSE)

####### export(drcount.inference)
drcount.inference <- function(cate.model, ps.model, data,
                              ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                              interactions = TRUE, n.boot = 500,
                              seed = NULL, verbose = 1, plot.boot = FALSE)

####### export(drmean.inference)
drmean.inference <- function(cate.model, ps.model, data,
                             ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                             interactions = TRUE, n.boot = 500,
                             verbose = 1, plot.boot = FALSE, seed = NULL)

####### export(drsurv.inference)
drsurv.inference <- function(cate.model, ps.model, data,
                             ipcw.model = NULL, followup.time = NULL, tau0 = NULL,
                             surv.min = 0.025, ipcw.method = "breslow",
                             ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                             n.boot = 500, seed = NULL, verbose = 1, plot.boot = FALSE)

####### export(pm)
pm <- function(response, cate.model, ps.model, data, score.method,
               init.model = NULL, # This is used in continuous data, when contrast/two regression is used
               ipcw.model = NULL, followup.time = NULL, tau0 = NULL,
               surv.min = 0.025, ipcw.method = "breslow",
               higher.y = TRUE, prop.cutoff = seq(0.5, 1, length = 6),
               ps.method = "glm", minPS = 0.01, maxPS = 0.99,
               initial.predictor.method = NULL, xvar.smooth.score = NULL, xvar.smooth.init = NULL,
               tree.depth = 2, n.trees.rf = 1000, n.trees.boosting = 200, B = 3, Kfold = 5,
               error.maxNR = 1e-3, max.iterNR = 150, tune = c(0.5, 2),
               seed = NULL, plot.gbmperf = TRUE, verbose = 1)

####### export(pmcount)
pmcount <- function(cate.model, ps.model, data, score.method,
                    higher.y = TRUE, prop.cutoff = seq(0.5, 1, length = 6),
                    ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                    initial.predictor.method = "boosting", xvar.smooth = NULL,
                    tree.depth = 2, n.trees.boosting = 200, B = 3, Kfold = 6,
                    error.maxNR = 1e-3, max.iterNR = 150, tune = c(0.5, 2),
                    seed = NULL, plot.gbmperf = FALSE, ...)

####### export(pmmean)
pmmean <- function(cate.model, init.model, ps.model, data, score.method,
                   higher.y = TRUE, prop.cutoff = seq(0.5, 1, length = 6),
                   ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                   initial.predictor.method = "boosting", xvar.smooth.score = NULL, xvar.smooth.init = NULL,
                   tree.depth = 2, n.trees.rf = 1000, n.trees.boosting = 200, B = 3, Kfold = 6,
                   plot.gbmperf = FALSE, error.maxNR = 1e-3, tune = c(0.5, 2), seed = NULL, ...)

####### export(pmsurv)
pmsurv <- function(cate.model, ps.model, score.method, data,
                   ipcw.model = NULL, followup.time = NULL, tau0 = NULL,
                   surv.min = 0.025, ipcw.method = "breslow",
                   higher.y = TRUE, prop.cutoff = seq(0.5, 1, length = 6),
                   ps.method = "glm", minPS = 0.01, maxPS = 0.99,
                   initial.predictor.method = "randomForest",
                   tree.depth = 2, n.trees.rf = 1000, n.trees.boosting = 200, B = 3, Kfold = 5,
                   plot.gbmperf = TRUE, error.maxNR = 1e-3, max.iterNR = 100, tune = c(0.5, 2),
                   seed = NULL)

jcaldasmagalhaes commented 2 years ago

issue14.xlsx

(See the attached Excel file.) I have put all arguments of each function into a column in the Excel file so it's easier to see whether the order of the arguments is consistent. I colored the first few arguments for easier visualization, and we can already see that some of them have an inconsistent order, for example:

- init.model: the order is not consistent; it is sometimes NULL by default and other times has no default value
- score.method: the order is not consistent (pmsurv swaps its order with data)
- followup.time and tau0: sometimes one comes first, sometimes the other
- etc.
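A check like the one done by eye in the spreadsheet could also be scripted. The following is a minimal sketch, assuming it is enough to compare the relative order of the arguments two functions share; `count_demo` and `surv_demo` are hypothetical, shortened signatures rather than the real ones.

```r
# Return TRUE when the arguments shared by f and g appear in the same
# relative order in both signatures.
same_relative_order <- function(f, g) {
  a <- names(formals(f))
  b <- names(formals(g))
  identical(a[a %in% b], b[b %in% a])
}

# Hypothetical shortened signatures: data and score.method are swapped,
# mirroring the pmsurv inconsistency described above.
count_demo <- function(cate.model, ps.model, data, score.method) NULL
surv_demo  <- function(cate.model, ps.model, score.method, data,
                       followup.time = NULL, tau0 = NULL) NULL

print(same_relative_order(count_demo, surv_demo))
```

Arguments that exist in only one of the two functions (here `followup.time` and `tau0`) are ignored, so the check flags only genuine reorderings.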

jcaldasmagalhaes commented 2 years ago

Reminder for myself: don't touch any of the continuous-outcome functions (those with mean).

jcaldasmagalhaes commented 2 years ago

In the meantime we have renamed several functions (issue #22), so I am copying the overview of the name changes here, as it will be useful for fixing the order of the arguments.

Overview of all changed names:

cv => catecv (wrapper function)
cvcount => catecvcount
cvsurv => catecvsurv
cvmean => catecvmean

pm => catefit (wrapper function)
pmcount => catefitcount
pmsurv => catefitsurv
pmmean => catefitmean

dr.inference => atefit (wrapper function)
drcount.inference => atefitcount
drsurv.inference => atefitsurv
drmean.inference => atefitmean
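For scripted work on the signatures, the renaming can be kept as a lookup vector. This is just a convenience sketch: the `renames` vector and the `new_name()` helper are hypothetical and not part of the package.

```r
# Old function name -> new function name, copied from the overview above.
renames <- c(
  cv = "catecv", cvcount = "catecvcount",
  cvsurv = "catecvsurv", cvmean = "catecvmean",
  pm = "catefit", pmcount = "catefitcount",
  pmsurv = "catefitsurv", pmmean = "catefitmean",
  dr.inference = "atefit", drcount.inference = "atefitcount",
  drsurv.inference = "atefitsurv", drmean.inference = "atefitmean"
)

# Translate one or more old names; unknown names come back as NA.
new_name <- function(old) unname(renames[old])

print(new_name(c("pmsurv", "dr.inference")))
```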

jcaldasmagalhaes commented 2 years ago

issue14_overviewOrderArguments.xlsb.xlsx

The argument order has been updated according to the yellow part of the Excel file. The parts that are not in yellow have not been updated, as they had lower importance. Here are some things I noticed that I am not sure are issues:

jcaldasmagalhaes commented 2 years ago

@NightlordTW let me know if you want to fix the order of the arguments in the function calls (in the examples). We have quite a few function calls, so I suspect this is going to take some time. And it is not really necessary, since all function calls use the parameter names.
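To illustrate why named calls are unaffected by a reordering: R matches named arguments by name before it considers position. The two functions below are hypothetical stand-ins that only echo their inputs, with `data` and `score.method` swapped between them.

```r
# Hypothetical "before" and "after" signatures; only the argument order differs.
fit_old <- function(cate.model, ps.model, score.method, data) {
  list(cate.model = cate.model, ps.model = ps.model,
       data = data, score.method = score.method)
}
fit_new <- function(cate.model, ps.model, data, score.method) {
  list(cate.model = cate.model, ps.model = ps.model,
       data = data, score.method = score.method)
}

# Named call: every value reaches the intended argument in both versions.
named_old <- fit_old(cate.model = "y ~ x", ps.model = "trt ~ x",
                     score.method = "poisson", data = "df")
named_new <- fit_new(cate.model = "y ~ x", ps.model = "trt ~ x",
                     score.method = "poisson", data = "df")

# Positional call: the third and fourth values land on different arguments.
pos_old <- fit_old("y ~ x", "trt ~ x", "poisson", "df")
pos_new <- fit_new("y ~ x", "trt ~ x", "poisson", "df")

print(identical(named_old, named_new))
print(identical(pos_old, pos_new))
```

Only positional calls would need to be revisited after the argument order changes.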

jcaldasmagalhaes commented 2 years ago

We keep this as it is.