New function for durability analysis

monicagerber commented 5 years ago

From recent McElrath 708 BAMA report:

dura_comparison_fun = function(study_in){
  study_groups = as.character(unique(study_in$group_plot))
  if(any(is.na(study_groups))) stop("there is a NA group")
  levels_here <- levels(study_in$group_plot)
   map_df(head(study_groups, -1), function(i){
    index_i = which(study_groups == i)
    map_df(study_groups[-(1:index_i)], function(j){
      test_data = subset(study_in, group_plot %in% c(i, j))

      if (dplyr::n_distinct(test_data$group_plot) != 2) stop("Something wrong with subsetting")

      comparison_here <- paste0(i, ' vs. ', j)

      test_data_summary = test_data %>% group_by(group_plot) %>%
        dplyr::summarize(
          peak_responses = n(),
          censored = sum(censored_28),
          uncensored = peak_responses - censored,
          median_right = paste0(round(median(log_fold_change),2), " [", round(min(log_fold_change),2), ", ", round(max(log_fold_change),2), "]"))

          p_val = ifelse((min(test_data_summary$peak_responses >= 3 & any(test_data_summary$censored <= 0.5 * test_data_summary$peak_responses) == TRUE)), interval::ictest(L = test_data$log_fold_change_left, R = test_data$log_fold_change_right, group = test_data$group_plot, alternative="two.sided", scores='wmw')$p.value, NA)

      data.frame(
        Comparison = comparison_here,
        group1 = i,
        group2 = j,
        peak_responses = paste0(test_data_summary$peak_responses,collapse = ' vs. '),
        censored_vals = paste0(test_data_summary$censored,collapse = ' vs. '),
        uncensored_vals = paste0(test_data_summary$uncensored,collapse = ' vs. '),
        median_vals = paste0(test_data_summary$median_right, collapse = " vs. "),
        p_val = p_val
      )
    })
  })
}

wfulp commented 3 years ago

@monicagerber @mayerbry @celiaeddy Do you think this method is stable enough to add to the next release?

monicagerber commented 3 years ago

Oh boy, It's been a while since I thought about this. I think you should move forward with the current features in develop and this should go into the next release. I don't like that it depends on certain variable names (e.g. censored_28). I think it needs a little editing.

celiaeddy commented 3 years ago

hi! not sure if you are asking about the method meaning the test with interval censoring. that, I think, is stable. but I agree with @monicagerber that the function needs some love. in addition to variable names there are conditions like the number of responders at peak and the proportion of censored input values that should be allowed to vary. I think it will take some time to perfect and shouldn't hold up the release, but I have been consistently using this function for BAMA and NAb work so it may be worth working on.

FredHutch / VISCfunctions

New function for durability analysis #2