Open cicdguy opened 3 years ago
@hey59 would you have time to look into this today?
Provenance:
Creator: anajens
Thanks for bringing up this issue and thanks for the tag! I believe this is due to the different data being produced in the preprocessing step and the code =
in cdisc.data
Here's a modified version of your code that runs with no issue (although including 400 subjects makes the heatmap very busy)
library(random.cdisc.data)
ADSL <- radsl(cached = TRUE)
ADEX <- radex(cached = TRUE)
ADAE <- radae(cached = TRUE)
ADCM <- radcm(cached = TRUE)
add_visit <- function(data_need_visit){
visit_dates <- ADEX %>%
filter(PARAMCD == "DOSE") %>%
distinct(USUBJID, AVISIT, ASTDTM) %>%
group_by(USUBJID) %>%
arrange(ASTDTM) %>%
mutate(next_vis = lead(ASTDTM), is_last = ifelse(is.na(next_vis), TRUE, FALSE)) %>%
rename(this_vis = ASTDTM)
data_visit <- data_need_visit %>%
select(USUBJID, ASTDTM) %>%
left_join(visit_dates, by = "USUBJID") %>%
filter(ASTDTM > this_vis & (ASTDTM < next_vis | is_last == TRUE)) %>%
left_join(data_need_visit)
return(data_visit)
}
ADAE <- add_visit(ADAE) %>% distinct()
ADCM <- add_visit(ADCM) %>% distinct()
ADEX <- ADEX %>%
filter(PARCAT1 == "INDIVIDUAL") %>%
mutate(ongo_status = (EOSSTT == "ONGOING"))
app <- init(
data = cdisc_data(
cdisc_dataset("ADSL", ADSL),
cdisc_dataset("ADEX", ADEX),
cdisc_dataset("ADAE", ADAE),
cdisc_dataset("ADCM", ADCM),
code = "
ADSL <- radsl(cached = TRUE)
ADEX <- radex(cached = TRUE)
ADAE <- radae(cached = TRUE)
ADCM <- radcm(cached = TRUE)
ADEX <- ADEX %>%
filter(PARCAT1 == 'INDIVIDUAL') %>%
mutate(ongo_status = (EOSSTT == 'ONGOING'))
add_visit <- function(data_need_visit){
visit_dates <- ADEX %>%
filter(PARAMCD == 'DOSE') %>%
distinct(USUBJID, AVISIT, ASTDTM) %>%
group_by(USUBJID) %>%
arrange(ASTDTM) %>%
mutate(next_vis = lead(ASTDTM), is_last = ifelse(is.na(next_vis), TRUE, FALSE)) %>%
rename(this_vis = ASTDTM)
data_visit <- data_need_visit %>%
select(USUBJID, ASTDTM) %>%
left_join(visit_dates, by = 'USUBJID') %>%
filter(ASTDTM > this_vis & (ASTDTM < next_vis | is_last == TRUE)) %>%
left_join(data_need_visit)
return(data_visit)
}
ADAE <- add_visit(ADAE) %>% distinct()
ADCM <- add_visit(ADCM) %>% distinct()
",
check = TRUE
),
modules = root_modules(
tm_g_heat_bygrade(
label = "Heatmap by grade",
sl_dataname = "ADSL",
ex_dataname = "ADEX",
ae_dataname = "ADAE",
cm_dataname = "ADCM",
id_var = choices_selected(selected = "USUBJID",
choices = c("USUBJID", "SUBJID")),
visit_var = choices_selected(selected = "AVISIT",
choices = c("AVISIT")),
ongo_var = choices_selected(selected = "ongo_status",
choices = c("ongo_status")),
anno_var = choices_selected(selected = c("SEX", "COUNTRY"),
choices = c("SEX", "COUNTRY", "USUBJID")),
heat_var = choices_selected(selected = "AETOXGR",
choices = c("AETOXGR")),
conmed_var = choices_selected(selected = "CMDECOD",
choices = c("CMDECOD")),
plot_height = c(600, 200, 2000)
)
)
)
shinyApp(app$ui, app$server)
Provenance:
Creator: hey59
Hi @hey59 and @kpagacz , I was checking this earlier and I noticed that Konrad is trying to include this module in the osprey sample app. By doing the add_visit
steps for heatmap, it seems like we lost some records (for example, ADAE had 1934 records, but after add_visit call, only 1611 left). I'm just wondering if such preprocessing will affect other modules' output?
Provenance:
Creator: yli110-stat697
Hi @hey59 and @kpagacz , I was checking this earlier and I noticed that Konrad is trying to include this module in the osprey sample app. By doing the
add_visit
steps for heatmap, it seems like we lost some records (for example, ADAE had 1934 records, but after add_visit call, only 1611 left). I'm just wondering if such preprocessing will affect other modules' output?
My understanding is that, because the data in the example are randomly generated, not every record in ADAE has a corresponding AVISIT found in ADEX. For example, for subject 23, in ADAE
Thanks for the explanation @hey59! I think the case you describe is not just limited to the random data and can happen very often with real data too due to data entry errors and missed visits, etc. So re-defining visits across domains is very tricky to do.
Since it's likely that this module is used in an app with other modules that use ADAE data, the algorithm for deriving AVISIT should modify the number of ADAE records. This probably requires a little more work in the module itself so for now I will block issue to be addressed in the future.
Provenance:
Creator: anajens
The example does not work when changing the data to cached = TRUE.