Possible inconsistency in getSimulationSurvival() with specified thetaH0 and thetaH1

grlm1 commented 8 months ago

Hello, I am trying to understand thetaH1 in getSimulationSurvival. I am interested in an adaptive design involving a thetaH0 bound of 0.9. Note that I am interested in testing H0: HR >= 0.9 vs. H1: HR < 0.9, that means the 0.9 bound is not a non-inferiority bound in the classical sense. In the first scenario I ran a simulation without specifying thetaH1, which should be using the test-statistics of stage 1 and 2 and the observed hazard ratio of stage 2 to reassess the required events for stage 3. I picked out an iteration which had a hazard ratio of 0.4503625 at stage 2. Then I re-ran the simulation, using the same seed, with thetaH1 = 0.4503625 and looked at the same iteration again. Given I used the same seed, it was expected that the results including stage 2 test-statistic and observed hazard ratio are the same. However, this time, the reassessement, which in my understanding only requires the test-statistics of the previous stages (which here are the same between the two scenarios) and an assumed hazard ratio, which I also made sure to be the same given my specification of thetaH1, reported an amount of events for stage 3 which differs from the first scenario.

My code is below. Unless my understanding of the subject is incorrect, I do suspect some kind of inconsistency in regards the thetah0 bound between the two scenarios.

Kind regards Tim

library(dplyr)
library(rpact)
futilityBounds <- c(0, 0.688)
N <- 14000
informationRates <- c(0.4, 0.6, 1)

dIN <- getDesignInverseNormal(kMax = 3,
                              alpha = 0.025,
                              beta = 0.1,
                              sided = 1,
                              informationRates = informationRates,
                              futilityBounds = futilityBounds,
                              typeOfDesign = "asUser",
                              userAlphaSpending = c(0, 0.01, 0.025)
)

myCalcEventsFunction_with_division_by_thetaH0 <- function(...,
                                 stage, thetaH0, conditionalPower, estimatedTheta, 
                                 plannedEvents, eventsOverStages,
                                 minNumberOfEventsPerStage, maxNumberOfEventsPerStage, allocationRatioPlanned,
                                 conditionalCriticalValue) {

  theta <- max(1 + 1e-12, estimatedTheta)
  cat(paste0("conditional critical value: ", conditionalCriticalValue, "\n"))
  cat(paste0("estimated theta: ", estimatedTheta, "\n"))
  requiredStageEvents <-
    max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
  requiredStageEvents <- min(
    max(minNumberOfEventsPerStage[stage], requiredStageEvents),
    maxNumberOfEventsPerStage[stage]
  ) + eventsOverStages[stage - 1]
  return(requiredStageEvents)
}

#wrapper function for convenience, allows us to create the same simulation but it can only differ in thetaH1 and the calceventsfunction
sim_wrapper <- function(thetaH1 = NA_real_, calcEventsFunction = NULL) {
dIN_sim <- getSimulationSurvival(design = dIN,
                                 thetaH0 = 0.9,
                                 directionUpper = F,
                                 pi2 = 0.003,
                                 hazardRatio = c(0.4), 
                                 eventTime = 12,
                                 dropoutRate1 = 0.08,
                                 dropoutRate2 = 0.08,
                                 dropoutTime = 12,
                                 accrualTime = c(0,0.01),
                                 maxNumberOfSubjects = N,    
                                 plannedEvents =  c(20, 30, 50),
                                 maxNumberOfIterations = 75,
                                 longTimeSimulationAllowed = T,
                                 seed = 1910,
                                 showStatistics = T,
                                 maxNumberOfRawDatasetsPerStage = 2,
                                 minNumberOfEventsPerStage = c(20, 10, 20),
                                 maxNumberOfEventsPerStage = c(20, 10, 45), 
                                 conditionalPower = 0.9,
                                 calcEventsFunction = calcEventsFunction,
                                 thetaH1 = thetaH1
)}

#standard, use observed HR as thetaH1, interesting iteration 75 chosen
sim_wrapper(thetaH1 = NA_real_, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#at stage 2, some outputs:
#estimated HR: 0.4503625
#test statistic: 1.896057
#-> for stage 3: increase single events from 20 to 41

#using the same seed, the data looks the same up to IA2. Now using the HR that we have just seen from iteration 75 in the 
#"use interim HR as theta case" (0.4503625) as a fixed thetaH1.
#-> if I understand the theory correctly, it should lead to the same SSR, as the test statistics of stage 1 and 2 are identical, 
#and the theta for the SSR is the same
sim_wrapper(thetaH1 = 0.4503625, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#first of all consistency checks, estimated HR and test statistic of stage 2 are equal to the case above...
#estimated HR: 0.4503625
#test statistic: 1.896057
#BUT, the ssr now says we need 30 singular events for stage 3 instead of the 41 from above.

#now with custom function, translated from cpp code to the best of my ability, so it should be the same as rpact native if done correctly
#since we are using only 75 iterations, and we are interested in iteration 75, the last printed line in your console corresponds to the 
#conditional critical value and estimated theta of that iteration 75 (didn't know how to do that more neatly)
sim_wrapper(thetaH1 = NA_real_, calcEventsFunction = myCalcEventsFunction_with_division_by_thetaH0) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#conditional critical value: 0.924377935201586 (from console print...)
#estimated theta: 1.79855127135391 (from console print...)
#results look the same as the native rpact ssr function with 41 single events at stage 3
#but focus on estimatedTheta which gets passed internally
#estimated theta: 1.79855127135391
#while hazardRatioEstimateLR 0.4503625
#obviously there is a directionUpper mirroring
#but 1/1.79855127135391 != 0.4503625
#maybe estimatedTheta already the theta which would lead to the same test-statistic given thetaH0 = 0.9
#but 1/(0.4503625/0.9) !=  1.79855127135391
#however 1/(0.4503625/0.9) == 1.79855127135391/0.9

#now with a fixed theta, again, we should not have differences here for iteration 75 as the 
#chosen fixed theta is exactly the theta observed in stage 2 of iteration 75
sim_wrapper(thetaH1 = 0.4503625, calcEventsFunction = myCalcEventsFunction_with_division_by_thetaH0) %>% getData() %>% filter(iterationNumber == 75) %>% print()
#conditional critical value: 0.924377935201586 -> same as before, as expected
#estimated theta: 1.99839018568375 -> not the same
#this time this equation holds
#1/(0.4503625/0.9) == 1.99839018568375

#now I try this by hand, and on the negative side of the normal distribution first so i dont get confused by the directionUpper mirroring...
#max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
min(0, -0.924377935201586 - qnorm(0.9))^2 * (((1+1)^2)/1) / log(0.4503625/0.9)^2
#40.6071 -> I guess this would be consistent with the 41 from scenario 1 with thetaH1=NA_real_ and rpact native ssr function
#however, I plugged in the true HR of 0.4503625
#if I want to do this the directionUpper way (applying thetaH0 first, then mirroring), trivially the same as log(1) is 0.
max(0, 0.924377935201586 + qnorm(0.9))^2 * (((1+1)^2)/1) / log(1/(0.4503625/0.9))^2
#40.6071

#what happens in the fixed thetah1 case where it said we need 30 events for stage 3?
#estimatedTheta internally is 1.99839018568375, i.e. 1/(0.4503625/0.9)
#but then gets plugged in 
##max(0, conditionalCriticalValue + qnorm(conditionalPower))^2 * (((1+allocationRatioPlanned)^2)/allocationRatioPlanned) / log(theta/thetaH0)^2
#which divides it again by 0.9
#i.e.
max(0, 0.924377935201586 + qnorm(0.9))^2 * (((1+1)^2)/1) / log(((1/(0.4503625/0.9))/0.9))^2
#30.58873
#((1/(0.4503625/0.9))/0.9) = 2.220434 = 1/0.4503625

gwassmer commented 8 months ago

Hi Tim, Thank you for your precise investigation. We agree, it is an inconsistency that will be fixed for the next release. In the meantime, a possible solution would be to use the very nice wrapper function like this sim_wrapper(thetaH1 = 0.4503625 / 0.9, calcEventsFunction = NULL) %>% getData() %>% filter(iterationNumber == 75) %>% print() Always very impressive to receive your comments :-) Kind regards Gernot and Friedrich

fpahlke commented 7 months ago

Issue was fixed in branch dev/3.5.2

rpact-com / rpact

Possible inconsistency in getSimulationSurvival() with specified thetaH0 and thetaH1 #25