Merck / gsDesign2

Group Sequential Design Under Non-Proportional Hazards
https://merck.github.io/gsDesign2/
GNU General Public License v3.0
19 stars 8 forks source link

Robust way to compare 2 functions #411

Closed LittleBeannie closed 4 months ago

LittleBeannie commented 4 months ago

Following discussion in https://github.com/Merck/gsDesign2/pull/408.

jdblischak commented 4 months ago

Specifically the goal is to make the check below more robust:

https://github.com/Merck/gsDesign2/blob/cb91f5d6e16317490c56b6b4a1c80602a4c0481f/R/gs_update_ahr.R#L233-L234

The function may be saved to a file (eg by the Shiny app) and loaded later, so it's environment would be different, and thus evaluate to FALSE. For example:

gs_b
## function(par = NULL, k = NULL, ...) {
##   if (is.null(k)) {
##     return(par)
##   } else {
##     return(par[k])
##   }
## }
## <environment: namespace:gsDesign2>
z <- function(par = NULL, k = NULL, ...) {
  if (is.null(k)) {
    return(par)
  } else {
    return(par[k])
  }
}
identical(z, gs_b)
## [1] FALSE

We could remove the environment by comparing args() and body(), but that would cause problems if we updated the body function (eg to make it faster or more readable) since the version of the function saved in a file would not have been updated.

In any case, body() only works well for very simple functions. For gs_b(), environments end up getting included.


# from ?body
f <- function(x) x^5
body(f)
## x^5
str(body(f)) # no environemnts
##  language x^5

body(gs_b)
## {
##   if (is.null(k)) {
##     return(par)
##   }
##   else {
##     return(par[k])
##   }
## }
str(body(gs_b)) # has envs embedded
## language {  if (is.null(k)) {; return(par); }; else {; return(par[k]); } }
## - attr(*, "srcref")=List of 2
## ..$ : 'srcref' int [1:8] 71 45 71 45 45 45 4368 4368
## .. ..- attr(*, "srcfile")=Classes 'srcfilealias', 'srcfile' <environment: 0x00000286507149f8> 
##   ..$ : 'srcref' int [1:8] 72 3 76 3 3 3 4369 4373
## .. ..- attr(*, "srcfile")=Classes 'srcfilealias', 'srcfile' <environment: 0x00000286507149f8> 
##   - attr(*, "srcfile")=Classes 'srcfilealias', 'srcfile' <environment: 0x00000286507149f8> 
##   - attr(*, "wholeSrcref")= 'srcref' int [1:8] 1 0 77 1 0 1 1 4374
## ..- attr(*, "srcfile")=Classes 'srcfilealias', 'srcfile' <environment: 0x00000286507149f8>

Though we could wrap it in capture.output(), which works, but only if the function is never updated in {gsDesign2}.

identical(args(z), args(gs_b)) &&
  identical(capture.output(body(z)), capture.output(body(gs_b)))
## [1] TRUE
jdblischak commented 4 months ago

Instead of comparing functions, I'd suggest comparing returned values of functions instead.

I agree with @yihui that comparing returned values would be the most robust solution.

A major problem with that is that the function interface is not uniform. For gs_b(), the argument par is a numeric vector, but for the example in the test file, par is a list.

args(gs_b)
## function (par = NULL, k = NULL, ...) 
## NULL

args(x$input$lower)
## function (k = 1, par = list(sf = gsDesign::sfLDOF, total_spend = 0.025, 
##     param = NULL, timing = NULL, max_info = NULL), hgm1 = NULL, 
##     theta = 0.1, info = 1:3, efficacy = TRUE, test_bound = TRUE, 
##     r = 18, tol = 1e-06) 
## NULL

gs_b(par = 4:2, k = 2)
## [1] 3

x$input$lower(par = 4:2, k = 2)
## Error in par$timing : $ operator is invalid for atomic vectors

Is there documentation somewhere of how to create a valid function for computing the lower bound? The documentation of the argument lower for gs_design_ahr() is minimal:

https://github.com/Merck/gsDesign2/blob/cb91f5d6e16317490c56b6b4a1c80602a4c0481f/R/gs_design_ahr.R#L32

However, the default value is gs_spending_bound:

https://github.com/Merck/gsDesign2/blob/cb91f5d6e16317490c56b6b4a1c80602a4c0481f/R/gs_design_ahr.R#L196

And it also has par as a list:

args(gs_spending_bound)
## function (k = 1, par = list(sf = gsDesign::sfLDOF, total_spend = 0.025, 
##     param = NULL, timing = NULL, max_info = NULL), hgm1 = NULL, 
##     theta = 0.1, info = 1:3, efficacy = TRUE, test_bound = TRUE, 
##     r = 18, tol = 1e-06) 
## NULL

So I assume that gs_b() is the outlier

LittleBeannie commented 4 months ago

Is it a good idea if we save around 5-10 minutes to discuss it this Friday?

jdblischak commented 4 months ago

Is it a good idea if we save around 5-10 minutes to discuss it this Friday?

Yes, I think it would be a good discussion topic

jdblischak commented 4 months ago

I didn't realize this was such a pervasive pattern throughout the package, eg

https://github.com/Merck/gsDesign2/blob/4a97d17521636631cf8d47192ea608d2dbf958d5/R/to_integer.R#L295-L297

https://github.com/Merck/gsDesign2/blob/4a97d17521636631cf8d47192ea608d2dbf958d5/R/gs_power_combo.R#L110-L111

My fix in #413 was specific to comparing to gs_b().

I can image creating a function that is more flexible, eg is_same_func(f1, f2, input = list(<args>), that we could use to compare any two given functions. However, I'm hesitant to do this due to the reduction in code readability. How big of a concern is this really? How often will a user (or the Shiny app) pass a function that is effectively identical when it is much simpler to just use the default argument?

nanxstats commented 3 months ago

Yeah, it's quite prevalent. I can't quantify the size of the concern but the risk here is using a unreliable comparison mechanism that could return false negative results and generate wrong design results without the user knowing due to factors you can't control - such as reusing serialized objects from before. Would that be a huge issue? I don't know 🤔

I guess I see no better simple alternatives but using the attribute-based solution suggested by @yihui, a sketch:

add_identifier <- function(f, id) {
  attr(f, "id") <- id
  f
}

is_gs_b <- function(f) {
  id <- attr(f, "id")
  if (is.null(id)) {
    return(FALSE)
  } else {
    id == "gs_b"
  }
}

gs_b_id <- add_identifier(gsDesign2::gs_b, "gs_b")

is_gs_b(gsDesign2::gs_b)
is_gs_b(gs_b_id)
yihui commented 3 months ago

If we were to start from the beginning, I'd make upper and lower strings by default (instead of functions), which would be easier to store and more robust to compare. They can still accept function input, but the responsibility would be on the user end, i.e., they have to make sure they input an identical function each time. I'm not sure how common it would be for users to need to input custom functions.

Apparently, we can't start from the beginning now, and have to consider backward compatibility. My suggestion is that we change our default to strings (e.g., 'b', 'spending_bound', 'spending_combo'---feel free to use more meaningful and intuitive names). Then for those comparisons, we first check if the value is character and test for equality. If the value is function, we fall back to the current identical(., .) approach. I'm not sure if that makes sense.

jdblischak commented 3 months ago

I guess I see no better simple alternatives but using the attribute-based solution suggested

This still seems too complex to me. And I feel it only kicks the can down the road. What if a user passes their own custom function? If they don't add this id attribute, then the code still won't handle it properly.

My suggestion is that we change our default to strings

I am leaning towards this solution. I think it makes sense to directly pass functions when a user is free to create and pass their own custom functions. We use this feature to great effect for the cut and test functions in {simtrial}.

However, in {gsDesign2}, there is no possibility of providing a user-defined spending function and obtaining a reliable result. The code logic only updates the boundaries if gs_b or gs_spending_bound is provided, eg

https://github.com/Merck/gsDesign2/blob/4a97d17521636631cf8d47192ea608d2dbf958d5/R/to_integer.R#L312-L320

If we still want to support the potential of users providing their own functions instead of using gs_b() or gs_spending_bound(), I propose adding a string argument that specifies the type of boundary function. I don't understand the domain well enough to choose proper names, but I'm imaging arguments like:

  upper_type = "dynamic",
  upper = gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha),
  lower_type = "dynamic",
  lower = gs_spending_bound,
  lpar = list(sf = gsDesign::sfLDOF, total_spend = beta),

And then the code logic would be:

 # Updated lpar 
 if (identical(x$input$lower_type, "static")) { 
   lpar_new <- x$input$lpar 
 } else if (identical(x$input$lower, "dynamic")) { 
   lpar_new <- x$input$lpar 
   if (!("timing" %in% names(x$input$lpar))) { 
     lpar_new$timing <- upar_new$timing 
   } 
 } else {
  stop('lower_type must be either "static" or "dynamic"')
}
yihui commented 3 months ago

I agree with @jdblischak. I don't know the domain enough, either, so domain experts will have to make the call.