atorus-research / Tplyr

https://atorus-research.github.io/Tplyr/

Getting an error when I try to use Tplyr for the table `adverse events by maximum severity` #143

Closed jagadishkatam closed 11 months ago

jagadishkatam commented 11 months ago

I am trying to develop the adverse events by maximum severity table using the ADAE data frame. I am taking the maximum severity values per USUBJID, AEDECOD and AESEV. I pass the data into Tplyr as below:

dt <- Tplyr::tplyr_table(adae, TRTA) %>% 
  set_pop_data(adsl) %>% 
  set_pop_treat_var(TRTA) %>% 
  set_pop_where(TRUE) %>% 
    Tplyr::add_layer(group_count(vars(AEDECOD,AESEV)) %>% 
                     set_format_strings(f_str("xxx (xx.x%)", distinct_n, distinct_pct))) %>% 
  set_distinct_by(USUBJID) %>% 
  add_total_group() %>%
  Tplyr::build() 

I get the following error:

image

I want to get the following output:

image

Any thoughts on how I can generate this type of table using Tplyr?
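
For readers following along, the maximum-severity derivation described in this question can be sketched with dplyr as below. This is only an illustration under stated assumptions: the miniature `adae` data frame and the `sev_levels` ordering (MILD < MODERATE < SEVERE) are hypothetical and not taken from the thread, and only the columns shown are kept.

```r
library(dplyr)

# Hypothetical miniature ADAE; the values are made up for illustration only
adae <- data.frame(
  USUBJID = c("01", "01", "01", "02", "02"),
  TRTA    = c("Drug", "Drug", "Drug", "Placebo", "Placebo"),
  AEDECOD = c("HEADACHE", "HEADACHE", "NAUSEA", "HEADACHE", "HEADACHE"),
  AESEV   = c("MILD", "SEVERE", "MODERATE", "MODERATE", "MILD")
)

# Assumed severity ordering, lowest to highest
sev_levels <- c("MILD", "MODERATE", "SEVERE")

# Keep one row per subject and preferred term, carrying the highest severity
adae_max <- adae %>%
  group_by(USUBJID, TRTA, AEDECOD) %>%
  summarise(
    AESEV = sev_levels[max(match(AESEV, sev_levels))],
    .groups = "drop"
  )

print(adae_max)
```

With one row per subject and preferred term, a distinct count by USUBJID places each subject in exactly one severity category per term.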

mstackhouse commented 11 months ago

Hi @jagadishkatam. I would suggest doing this using a by variable instead of a nested count layer.

dt <- tplyr_table(adae, TRTA) %>% 
  set_pop_data(adsl) %>% 
  set_pop_treat_var(TRTA) %>% 
  set_pop_where(TRUE) %>% 
  set_distinct_by(USUBJID) %>% 
  add_total_group() %>%
  add_layer(
    group_count(AEDECOD, by = AESEV) %>% 
      set_format_strings(f_str("xxx (xx.x%)", distinct_n, distinct_pct))
  ) %>% 
  build()

It's not going to give that exact presentation, but with #129 you would be able to post-process the output into this format.

Does this help?

jagadishkatam commented 11 months ago

Thank you @mstackhouse for your prompt response. I tried your approach of using by, and it created row_label1 and row_label2. Since I wanted to merge row_label1 and row_label2 into a single column, I followed it with some post-processing.

dt <- Tplyr::tplyr_table(adae, TRTA) %>% 
  set_pop_data(adsl) %>% 
  set_pop_treat_var(TRTA) %>% 
  set_pop_where(TRUE) %>% 
  Tplyr::add_layer(group_count(AESEV, by=all) %>% 
                     set_format_strings(f_str("xxx (xx.x%)", distinct_n, distinct_pct))) %>% 
  set_distinct_by(USUBJID) %>% 
  Tplyr::add_layer(group_count(AESEV,by=AEDECOD) %>% 
                     set_format_strings(f_str("xxx (xx.x%)", distinct_n, distinct_pct))) %>% 
  set_distinct_by(USUBJID) %>% 
  add_total_group() %>%
  Tplyr::build() 

firstrow <- dt[dt$ord_layer_2==1,c('row_label1','ord_layer_1','ord_layer_2')]
firstrow$ord_layer_2 <- 0

dt <- bind_rows(dt, firstrow) %>%
  mutate(ord_layer_1 = ifelse(row_label1 == 'Subjects with any Adverse Events', 0, ord_layer_1))
dt <- dt %>%
  mutate(row_label1 = ifelse(!is.na(row_label2), paste(' ', row_label2), row_label1)) %>%
  arrange(ord_layer_1, ord_layer_2, row_label2)

It results in:

image

One thing I am now stuck on is the page breaking by AEDECOD, highlighted in blue, which I am unable to understand. Could you please let me know your thoughts?
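
The post-processing in this comment can be tried without Tplyr on a small hand-built data frame. The sketch below assumes a hypothetical stand-in for the `build()` output: the column `var1_Drug` and all values are made up, and only the `row_label*` and `ord_layer_*` columns mirror the names used above. It shows how the header rows are created and how the severity labels get indented under them.

```r
library(dplyr)

# Hypothetical stand-in for the build() output: one row per label and severity
dt <- data.frame(
  row_label1  = c(rep("Subjects with any Adverse Events", 2),
                  rep("HEADACHE", 2), rep("NAUSEA", 2)),
  row_label2  = rep(c("MILD", "MODERATE"), 3),
  var1_Drug   = c("  3 (60.0%)", "  1 (20.0%)", "  2 (40.0%)",
                  "  1 (20.0%)", "  1 (20.0%)", "  0 ( 0.0%)"),
  ord_layer_1 = c(1, 1, 1, 1, 2, 2),
  ord_layer_2 = rep(c(1, 2), 3)
)

# One header row per group: copy the first row of each group, keep only the
# label and ordering columns, and sort it ahead of the severity rows
header <- dt[dt$ord_layer_2 == 1, c("row_label1", "ord_layer_1", "ord_layer_2")]
header$ord_layer_2 <- 0

out <- bind_rows(dt, header) %>%
  # Push the "any adverse event" block to the top
  mutate(ord_layer_1 = ifelse(row_label1 == "Subjects with any Adverse Events",
                              0, ord_layer_1)) %>%
  # Severity rows take the indented severity label; header rows keep theirs
  mutate(row_label1 = ifelse(!is.na(row_label2),
                             paste(" ", row_label2), row_label1)) %>%
  arrange(ord_layer_1, ord_layer_2, row_label2)

print(out[, c("row_label1", "var1_Drug")])
```

The header rows carry `NA` in the count columns because `bind_rows()` fills columns missing from `header`; a display package would typically render those as blanks.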

mstackhouse commented 11 months ago

@jagadishkatam looks great! My plan is to introduce a function that can avoid that post processing.

The page breaking is out of scope of Tplyr itself and depends on the package that you're using. What package are you using for display?

jagadishkatam commented 11 months ago

Thank you @mstackhouse, I am using the reporter package.

mstackhouse commented 11 months ago

You would have to look into the documentation for reporter https://reporter.r-sassy.org/index.html