Open CrystalXuR opened 2 years ago
Let's wait a bit before adding tests and:
1) Can you see last point in #6 and make sure to use survival::
+ Imports: survival
2) I would want all the estimator files in R/ do having a runnable @example
https://github.com/som-shahlab/survmetabench/blob/master/r-package/survlearners/R/surv_xl_lasso.R#L11
that means that when opening a new PR, R runs all the code examples and we know it's not broken
that means you need to go through all files and:
replace \dontrun with \donttest
which will make R CMD check execute all those examples
Can you do this in separate PRs?
yes, I will do these two things in two separate PRs.
@erikcs Please let me know if you have other things want me to check, otherwise, I think you and @nignatiadis can start the review
There is some more cleanup to do:
1) Remove MASS and mvtnorm from Imports if they are no longer being used for dgps, that closes #6
2) All the old R-learner files have signature (x, w, y), that is a recipe for bugs when GRF uses (Covariates, Outcome, Treatment). They should all be updated to use signature (X, Y, W) with uppercase letters for consistency. that closes #3
3) The rest of the surv_ estimators has signature (data, data.test), that is not the signature a user should face, it should be (X, Y, W, etc). Making convenience entry functions for simulations using (data, data.test) is something you can do outside the r-package in a separate file for simulations. CSF.R should not be part of the R/ package There can be an experiments\ folder in the main repo which stiches together and sets up the simulations (HPC.R) by doing library survlearners, see GRF for an example
4) The predict.* signatures should be predict(object, newdata = NULL): NOT "newx" (follow predict.lm which use newdata)
5) Aesthetics:
a) All the files in R/ should have consistent signature. "variable.names" with punctuation and function_name with underscore. Everything besides these use .
, cen_fit is not consistent. Arguments that are similar to grf and passed on etc should also be named consistently, for example "sample.weights" instead of "weights", "Kaplan-Meier" instead of "KM". This includes everything in the R code too, y_fit is not consistent with S.hat, nor varName, use consistent . for variable.names.
b) go over every file to make sure coding style is consistent, use of white space is consistent, use of assignemnt operator <- is consistent. this file https://github.com/som-shahlab/survmetabench/blob/master/r-package/survlearners/R/utils.R is a mess which mixes double double tabs, =/<- , sometimes have space between =, sometimes not, sometimes have space between if(.., sometimes not, some files use single ' some files use ", needs to be fixed and make consistent througout. Look at GRF for example
6) Final package requirement: I will turn on pedantic mode in R CMD check here: https://github.com/som-shahlab/survmetabench/pull/10 which means you will have to make R CMD checks gives zero warnings, which means for example there can be no mistakes in @ docstrings. Ideally it will make NOTEs in R CMD check illegal too,
7) Final before Nikos/me goes over all the code logic in estimators: a) make sure no more major cleanup needed b) meet and make sure everyone agrees what the expectation of surv_* is. Should the non-survival Rlearners be made internal
Thank you Erik for the well-explained list, I will go through them and let you know when I am done!
Here are some suggestions for ~ invariance tests to put in for all estimators:
edit: added in #135
@erikcs I am done with cleaning and editing all the codes in the R folder, should I add and test them to the testthat\ before hand over to you and @nignatiadis? if so, what is the workflow for that process?