egenn / rtemis

Advanced Machine Learning and Visualization
https://rtemis.org
GNU General Public License v3.0
137 stars 19 forks source link

Lots of missing no visible global function/no visible binding for global variable #53

Closed HenrikBengtsson closed 9 months ago

HenrikBengtsson commented 9 months ago

R CMD check --as-cran reports:

❯ checking R code for possible problems ... [34s/34s] NOTE
  binmat2vec: no visible global function definition for ‘.’
  dplot3_addtree: no visible binding for global variable ‘plt’
  dplot3_box: no visible binding for global variable ‘ID’
  dplot3_box: no visible global function definition for ‘.’
  dplot3_box: no visible binding for global variable ‘timeperiod’
  dt_describe: no visible binding for global variable ‘..index_nm’
  dt_describe: no visible binding for global variable ‘..index_cf’
  dt_describe: no visible binding for global variable ‘..index_dt’
  dt_get_duplicates: no visible binding for global variable ‘..on’
  dt_get_factor_levels: no visible binding for global variable
    ‘..factor_index’
  glm2table: no visible binding for global variable ‘..i’
  gplot3_map: no visible binding for global variable ‘x’
  gplot3_map: no visible binding for global variable ‘y’
  gplot3_map: no visible binding for global variable ‘group’
  gplot3_map: no visible binding for global variable ‘county’
  gplot3_map: no visible binding for global variable ‘abbr’
  likelihoodMediboostChooseFeat: no visible binding for global variable
    ‘rpart.params’
  matchCasesByRules: no visible binding for global variable ‘ID’
  mplot3_conf: no visible binding for global variable ‘autolabel’
  mplot3_heatmap: no visible binding for global variable ‘autolabel’
  mplot3_laterality: no visible binding for global variable ‘..index’
  mplot3_mosaic: no visible binding for global variable ‘autolabel’
  mplot3_varimp: no visible binding for global variable ‘autolabel’
  mplot_AGGTEobj: no visible binding for global variable ‘font.family’
  plotly_shade: no visible binding for global variable ‘scatter.type’
  preprocess_: no visible binding for global variable ‘..exclude’
  s_HAL: no visible binding for global variable ‘which.cv.lambda’
  s_LMTree: no visible binding for global variable ‘varimp’
  s_LightRuleFit: no visible binding for global variable ‘Empirical_Risk’
  s_LightRuleFit: no visible binding for global variable ‘Coefficient’
  s_PolyMARS: no visible binding for global variable ‘s_POLYMARS’
  s_RuleFit: no visible binding for global variable ‘s_RULEFIT’
  splitlin_: no visible binding for global variable ‘rtOrange’
  splitlineRC: no visible binding for global variable ‘rho.def’
  summarize.data.table: no visible global function definition for ‘.’
  varSelect: no visible binding for global variable ‘s_XGBLIN’
  Undefined global functions or variables:
    . ..exclude ..factor_index ..i ..index ..index_cf ..index_dt
    ..index_nm ..on Coefficient Empirical_Risk ID abbr autolabel county
    font.family group plt rho.def rpart.params rtOrange s_POLYMARS
    s_RULEFIT s_XGBLIN scatter.type timeperiod varimp which.cv.lambda x y

Some of them might be bugs, i.e. non-existing functions or objects. Others might be used in NSE code. For the latter, I use dummy assignments to NULL at the top of the function, e.g.

foo <- function(x) {
  ## To please R CMD check
  abc <- def <- NULL

  my_nse(x, abc & def)
}

Others use:

utils::globalVariables(c("abc", "def"))

foo <- function(x) {
  my_nse(x, abc & def)
}

but I think that's too blunt and error-prone.

egenn commented 9 months ago

I fixed some and removed some. The remaining are legitimate data.table syntax. Do we really need to use dummy assignments every time we use data.table in order to please R CMD check?

* checking R code for possible problems ... [24s/24s] NOTE
dplot3_box: no visible binding for global variable ‘ID’
dplot3_box: no visible binding for global variable ‘timeperiod’
dt_describe: no visible binding for global variable ‘..index_nm’
dt_describe: no visible binding for global variable ‘..index_cf’
dt_describe: no visible binding for global variable ‘..index_dt’
dt_get_duplicates: no visible binding for global variable ‘..on’
dt_get_factor_levels: no visible binding for global variable
  ‘..factor_index’
glm2table: no visible binding for global variable ‘..i’
matchCasesByRules: no visible binding for global variable ‘ID’
mplot3_laterality: no visible binding for global variable ‘..index’
preprocess_: no visible binding for global variable ‘..exclude’
s_LightRuleFit: no visible binding for global variable ‘Empirical_Risk’
s_LightRuleFit: no visible binding for global variable ‘Coefficient’
Undefined global functions or variables:
  ..exclude ..factor_index ..i ..index ..index_cf ..index_dt ..index_nm
  ..on Coefficient Empirical_Risk ID timeperiod
egenn commented 9 months ago

I set everything to NULL as suggested and the NOTE is gone.

HenrikBengtsson commented 9 months ago

Do we really need to use dummy assignments every time we use data.table in order to please R CMD check?

I do this (set dummy variables to NULL inside functions for NSE variables), because then when R CMD check --as-cran complains in the future, I know it's most likely a true positive. If I'd used utils::globalVariables(), there's a risk that there will be false negatives, especially if your code base is huge.

I write "To please R CMD check", but I'm really grateful for these checks, because they definitely have spotted bugs in code, both in my code and others.

It's awesome that the package now passes the static-code inspection/validation.