Closed by AngelCampos 4 years ago
Hi,
I guess you're right about the source of the error. You probably just have to use !! to unquote iVar. Here's an example that should work:
library(cutpointr)
varlist <- c("dsi", "age")
lapply(varlist, function(var) {
  cutpointr(data = suicide, x = !!var, class = suicide)
})
#> Assuming the positive class is yes
#> Assuming the positive class has higher x values
#> Assuming the positive class is no
#> Assuming the positive class has higher x values
#> [[1]]
#> # A tibble: 1 x 16
#> direction optimal_cutpoint method sum_sens_spec acc sensitivity
#> <chr> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 >= 2 maximize_metric 1.75179 0.864662 0.888889
#> specificity AUC pos_class neg_class prevalence outcome predictor
#> <dbl> <dbl> <fct> <fct> <dbl> <chr> <chr>
#> 1 0.862903 0.923779 yes no 0.0676692 suicide dsi
#> data roc_curve boot
#> <list> <list> <lgl>
#> 1 <tibble [532 x 2]> <tibble [13 x 10]> NA
#>
#> [[2]]
#> # A tibble: 1 x 16
#> direction optimal_cutpoint method sum_sens_spec acc sensitivity
#> <chr> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 >= 56 maximize_metric 1.11537 0.199248 0.143145
#> specificity AUC pos_class neg_class prevalence outcome predictor
#> <dbl> <dbl> <fct> <fct> <dbl> <chr> <chr>
#> 1 0.972222 0.525678 no yes 0.932331 suicide age
#> data roc_curve boot
#> <list> <list> <lgl>
#> 1 <tibble [532 x 2]> <tibble [61 x 10]> NA
Created on 2020-10-27 by the reprex package (v0.3.0)
By the way, maybe multi_cutpointr can be an alternative here. The entries of varList are just columns from tmpData, right? Then, if you would rather have a data.frame instead of a list, you can do
library(cutpointr)
multi_cutpointr(suicide, x = c("age", "dsi"), class = suicide,
                pos_class = "yes")
#> age:
#> Assuming the positive class has lower x values
#> dsi:
#> Assuming the positive class has higher x values
#> # A tibble: 2 x 16
#> direction optimal_cutpoint method sum_sens_spec acc sensitivity
#> <chr> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 <= 55 maximize_metric 1.11537 0.199248 0.972222
#> 2 >= 2 maximize_metric 1.75179 0.864662 0.888889
#> specificity AUC pos_class neg_class prevalence outcome predictor
#> <dbl> <dbl> <chr> <fct> <dbl> <chr> <chr>
#> 1 0.143145 0.525678 yes no 0.0676692 suicide age
#> 2 0.862903 0.923779 yes no 0.0676692 suicide dsi
#> data roc_curve boot
#> <list> <list> <lgl>
#> 1 <tibble [532 x 2]> <tibble [61 x 10]> NA
#> 2 <tibble [532 x 2]> <tibble [13 x 10]> NA
Created on 2020-10-27 by the reprex package (v0.3.0)
Yes, I tried multi_cutpointr() first, but the problem I encountered is that when I used the multi-variable models created with multi_cutpointr() to predict on new data with the predict() function, I got an "error: C stack usage 19923892 is too close to the limit". I even tried subsetting the new data, as suggested for similar problems with this kind of error, and that didn't work either.
In the end, I worked around it by creating new data frames, changing the $predictor variable name, and so on until it worked. I didn't know about "!!" for unquoting, sadly. I will try it later on; it would be a more succinct solution.
Is the "C stack usage XXXXX is too close to the limit" error something your users have experienced before, or have you never heard of it? I will try to document it if I encounter it again. I will also try with some built-in data to see if I can reproduce the error, just not today 😜.
Thanks
Just to close the issue.
Yes, the problem was solved using !!.
Any reference you could point me to, to better understand the behavior? I have never used this operator before.
That "C stack usage" error is a bit weird. Predicting with multi_cutpointr objects is simply not supported (and probably also won't be supported in the future) and should throw the error no applicable method for 'predict' applied to an object of class "c('multi_cutpointr', 'tbl_df', 'tbl', 'data.frame')". Maybe we can print a more helpful error message there, so thanks for the pointer.
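Since predict() is not supported for multi_cutpointr objects, one possible workaround is to fit one cutpointr model per variable and predict from each model separately. The following is only a sketch under assumptions: it uses the built-in suicide data as stand-in "new data", the names vars, models and preds are made up for illustration, and it assumes that predict() on a single cutpointr object accepts a newdata data frame containing the predictor column, as described in the package documentation.

```r
library(cutpointr)

# Illustrative variable names; these are columns of the built-in suicide data.
vars <- c("dsi", "age")

# Fit one model per predictor, unquoting the column name with !!.
models <- lapply(vars, function(v) {
  cutpointr(data = suicide, x = !!v, class = suicide, pos_class = "yes")
})
names(models) <- vars

# Predict separately from each single-variable model.
# Here the training data doubles as "new data" purely for demonstration.
preds <- lapply(vars, function(v) {
  predict(models[[v]], newdata = suicide)
})
names(preds) <- vars
```

Each element of preds should then hold one predicted class per row of the new data for the corresponding variable.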
That !! operator ("bang bang") is a very standard way of unquoting variables in functions that use tidy evaluation. I think the plan was or is to replace it by {{ ("curly-curly"), but both will work in the above example. There's for example a blog post on that (https://www.brodrigues.co/blog/2019-06-20-tidy_eval_saga/), a maybe already superseded vignette on programming with dplyr where !! is mentioned (http://rstudio-pubs-static.s3.amazonaws.com/328769_e8a0152e155b4163b4a54473adcea229.html) and a more technical explanation in Advanced R (https://adv-r.hadley.nz/quasiquotation.html).
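To make the effect of !! concrete, here is a minimal sketch using rlang directly. The helper capture_arg() is hypothetical and only mimics how a tidy-evaluation function might capture an argument such as x; the actual internals of cutpointr may differ.

```r
library(rlang)

# Hypothetical helper (not part of cutpointr): capture the argument
# expression without evaluating it, as tidy-eval functions typically do.
capture_arg <- function(x) {
  quo_get_expr(enquo(x))
}

var <- "dsi"

# Without unquoting, the function sees the symbol `var`, not its value.
captured_plain <- capture_arg(var)
is.symbol(captured_plain)
#> [1] TRUE

# With !!, the value of `var` is inserted before the argument is captured.
captured_unquoted <- capture_arg(!!var)
identical(captured_unquoted, "dsi")
#> [1] TRUE
```

This is why a loop variable passed without !! can look like a nonexistent column: the function receives the name of the loop variable rather than the column name stored in it.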
Anyway, glad the function works now.
Thanks for the references. Best
I'm having many issues when trying to create cutpointr models through lapply().
Here is an example that I think makes it easy to see why I expect it to work, even though it does not.
In this case, when I try to pass a character vector to the argument x of cutpointr() using lapply(), it does not recognize the existence of such an object.
My workaround is to input predictions and class as vectors instead of in a data.frame, but I would prefer to do it the way I suppose it is intended. Is there something that could be done? Maybe something that has to do with tidy evaluation, such as eval_tidy()?
Best regards