Open jdstallings opened 2 years ago
Hi Jonathan:
1) Can you show me your code?
2) Can you mock up in paint or Word or something the output you are trying to achieve?
David
Sir, thanks for your time. Your work is this area is so appreciated!
DS <- tibble::tibble(ARM = c("Active", "Active", "Active", "Active", "Active", "Active", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Active", "Active", "Active"), SUBJID = c("01-301", "01-301", "01-301", "01-302", "01-302", "01-302", "01-303", "01-303", "01-303", "01-304", "01-304", "01-304", "01-305", "01-305", "01-305"), ASRE = c("78/M/N/W", "78/M/N/W", "78/M/N/W", "78/M/N/W", "78/M/N/W", "78/M/N/W", "82/M/N/W", "82/M/N/W", "82/M/N/W", "75/M/N/W", "75/M/N/W", "75/M/N/W", "53/M/N/W", "53/M/N/W", "53/M/N/W"), DSTERM = rep(c("Eligibility criteria met", "Double-masked phase complete", "Open-label phase complete"),5), DSCAT = rep("Protocol Milestone", 15) )
tbl <- reporter::create_table(DS) %>% reporter::titles("Disposition Table", "Example", "Subtitle") %>% reporter::define(ARM, blank_after = TRUE, dedupe = TRUE) %>% reporter::define(SUBJID, blank_after = TRUE, dedupe = TRUE) %>% reporter::define(ASRE, blank_after = TRUE, dedupe = TRUE) %>% reporter::column_defaults(., from = ARM, to = ASRE, width = 1) %>% reporter::column_defaults(., from = DSTERM, to = DSTERM, width = 2.5) %>% reporter::column_defaults(., from = DSCAT, to = DSCAT, width = 1.5) %>% logr::put()
outpath <- "/Users/jdstallings/Google Drive/Data_InDeed/projects/oGvHD-2-SC/analysis/tlf/week12/2022-06-18_l_subject_evaluability_week12.pdf"
logr::sep("## Define report object for PDF") rpt <- reporter::create_report(outpath, output_type = "PDF", orientation = "landscape", missing = "") %>% reporter::add_content(tbl, align = "center") %>% logr::put()
logr::sep("## Write reports to file system") reporter::write_report(rpt)
Sys.getenv() __CF_USER_TEXT_ENCODING 0x1F5:0x0:0x0 __CFBundleIdentifier org.rstudio.RStudio CLICOLOR_FORCE 1 COMMAND_MODE unix2003 DISPLAY /private/tmp/com.apple.launchd.PXR3jvbMKs/org.macosforge.xquartz:0 DYLD_FALLBACK_LIBRARY_PATH /Library/Frameworks/R.framework/Resources/lib:/Users/jdstallings/lib:/usr/local/lib:/usr/lib:/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server:/var/folders/72/7m_j_z5165l8dpky_m1z27qc0000gn/T/rstudio-fallback-library-path-d06YE8 EDITOR vi GIT_ASKPASS rpostback-askpass HOME /Users/jdstallings LANG en_US.UTF-8 LC_CTYPE en_US.UTF-8 LN_S ln -s LOGNAME jdstallings MAKE make MPLENGINE tkAgg PAGER /usr/bin/less PATH /usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin:/opt/X11/bin:/usr/local/MacGPG2/bin:/Library/TeX/texbin::/Applications/RStudio.app/Contents/MacOS/postback PYTHONIOENCODING utf-8 R_BROWSER /usr/bin/open R_BZIPCMD /usr/bin/bzip2 R_DOC_DIR /Library/Frameworks/R.framework/Resources/doc R_GZIPCMD /usr/bin/gzip R_HOME /Library/Frameworks/R.framework/Resources R_INCLUDE_DIR /Library/Frameworks/R.framework/Resources/include R_LIBS_SITE
R_LIBS_USER /Users/jdstallings/Google Drive/Data_InDeed/projects/oGvHD-2-SC/renv/library/R-4.0/x86_64-apple-darwin17.0:/private/var/folders/72/7m_j_z5165l8dpky_m1z27qc0000gn/T/RtmpXITUUb/renv-system-library R_PAPERSIZE a4 R_PDFVIEWER /usr/bin/open R_PLATFORM x86_64-apple-darwin17.0 R_PRINTCMD lpr R_QPDF /Library/Frameworks/R.framework/Resources/bin/qpdf R_RD4PDF times,inconsolata,hyper R_SESSION_TMPDIR /var/folders/72/7m_j_z5165l8dpky_m1z27qc0000gn/T//RtmpXITUUb R_SHARE_DIR /Library/Frameworks/R.framework/Resources/share R_STRIP_SHARED_LIB strip -x R_STRIP_STATIC_LIB strip -S R_SYSTEM_ABI macos,gcc,gxx,gfortran,gfortran R_TEXI2DVICMD /usr/local/bin/texi2dvi R_UNZIPCMD /usr/bin/unzip R_ZIPCMD /usr/bin/zip RENV_DEFAULT_R_ENVIRONRENV_DEFAULT_R_ENVIRON_USER RENV_DEFAULT_R_LIBS RENV_DEFAULT_R_LIBS_SITE
RENV_DEFAULT_R_LIBS_USER ~/Library/R/4.0/library RENV_DEFAULT_R_PROFILERENV_DEFAULT_R_PROFILE_USER RENV_PROJECT /Users/jdstallings/Google Drive/Data_InDeed/projects/oGvHD-2-SC RETICULATE_MINICONDA_PYTHON_ENVPATH /Users/jdstallings/Google Drive/Data_InDeed/projects/oGvHD-2-SC/renv/python/r-reticulate RETICULATE_PYTHON /usr/local/bin/python3 RMARKDOWN_MATHJAX_PATH /Applications/RStudio.app/Contents/Resources/resources/mathjax-27 RS_PPM_FD_READ 8 RS_PPM_FD_WRITE 9 RS_RPOSTBACK_PATH /Applications/RStudio.app/Contents/MacOS/rpostback RS_SHARED_SECRET 2764227788305013851740555842 RSTUDIO 1 RSTUDIO_CONSOLE_COLOR 256 RSTUDIO_CONSOLE_WIDTH 121 RSTUDIO_FALLBACK_LIBRARY_PATH /var/folders/72/7m_j_z5165l8dpky_m1z27qc0000gn/T/rstudio-fallback-library-path-d06YE8 RSTUDIO_PANDOC /Applications/RStudio.app/Contents/MacOS/quarto/bin RSTUDIO_PROGRAM_MODE desktop RSTUDIO_SESSION_PORT 19477 RSTUDIO_USER_IDENTITY jdstallings RSTUDIO_WINUTILS bin/winutils SDKROOT /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk SED /usr/bin/sed SHELL /bin/zsh SSH_ASKPASS rpostback-askpass SSH_AUTH_SOCK /private/tmp/com.apple.launchd.tyMNSUUJMd/Listeners TAR /usr/bin/tar TERM xterm-256color TMPDIR /var/folders/72/7m_j_z5165l8dpky_m1z27qc0000gn/T/ TZDIR /usr/share/zoneinfo USER jdstallings XPC_FLAGS 0x0 XPC_SERVICE_NAME application.org.rstudio.RStudio.364465833.364465853
Here is the actual script I'm using in the real listing:
################################################################################ ################################################################################
################################################################################ ################################################################################
library(sassy) library(magrittr)
base::options("logr.autolog" = TRUE, "logr.notes" = FALSE)
base::rm(.reports, logpath, lgpth, datafile, dirpath, outpath, population, visitnum, parmcd, select.cols, filter.rows, arrange.cols, group.by, contrasts.ls, blc, display, express, lhdr, rhdr, title, title2, subtitle, program, disclaimer, notes, notes1, notes2, notes3, notes4, notes5, notes6, dat, rc, dm, ie, tbl, rpt, rpt_rft, cols, lbls) %>% logr::put()
.reports <- data.frame(read_excel(paste0(.sys, .proj, "/analysis/tlf/week12/", "specifications_week12.xlsx"), sheet = "specifications")) i = 3 ## Item Number
logr::sep("Specifications") logr::put("## DIRECTORY WITH THE LOGS/AUDIT REPORTS")
logpath = paste0(.sys, .proj, .logs, "/", Sys.Date(), "_", .reports$Log[i]) lgpth <- logr::logopen(logpath) %>% logr::put() logr::put("## FILE WITH EXTENSION TO BE ANALYZED") datafile = .reports$Dataset[i] %>% logr::put() logr::put("## DIRECTORY WITH THE DATASET TO BE ANALYZED") dirpath = paste0(.sys, .proj, .sdtm, "/") %>% logr::put() logr::put("## DIRECTORY TO THE RESULTS/REPORT PRINT OUT") outpath = paste0(.sys, .proj, "/analysis/tlf/week12/", Sys.Date(), "", .reports$Output[i]) %>% logr::put() logr::put("## SELECTED POPULATION FOR ANALYSIS") population = unlist(.reports$Population[i]) %>% logr::put() logr::put("## VISIT NUMBERS TO BE ANALYZED") visitnum = unlist(str_split(.reports$VISITNUM[i], ";")) %>% logr::put() logr::put("## PARAMETER SHORT NAME TO BE ANALYZED") parmcd = unlist(str_split(.reports$PARMCD[i], ";")) %>% logr::put() logr::put("## SELECTED COLUMNS") select.cols = unlist(str_split(.reports$Select[i], ";")) %>% logr::put() logr::put("## FILTERED ROWS") filter.rows = unlist(str_split(.reports$Filter[i], ";")) %>% logr::put() logr::put("## ARRANGE OUTPUT FROM THE TABLE") arrange.cols = unlist(str_split(.reports$Arrange[i], ";")) %>% logr::put() logr::put("## GROUP BY VARIABLES") group.by = unlist(str_split(.reports$Group[i], ";")) %>% logr::put() logr::put("## CONTRASTS IS THE LIST OF CONTRASTS TO REPORT") contrasts.ls = unlist(str_split(.reports$Contrasts[i], ";")) %>% logr::put() logr::put("## BASELINE CORRECTED") blc = .reports$ABLFL[i] %>% logr::put() logr::put("## DISPLAY PARAMETERS") display = unlist(str_split(.reports$Display[i], ";")) %>% logr::put() logr::put("## EXPRESSIONS TO MODIFY OUTPUT") express = unlist(str_split(.reports$Express[i], ";")) %>% logr::put() logr::put("## LHDR IS THE LEFT SIDE HEADER INFO") lhdr = .reports$Sponsor[i] %>% logr::put() logr::put("## RHDR IS THE RIGHT SIDE HEADER INFO") rhdr = .reports$Study[i] %>% logr::put() logr::put("## TITLE FOR THE TABLE GENERATED") title = paste0(.reports$TLFTYPE[i], ": ", .reports$TLFNUMBER[i], " ", .reports$Title[i]) %>% logr::put() put("## SECOND TITLE FOR THE TABLE GENERATED") title2 = ifelse(is.null(.reports$Title2[i]),.reports$Title2[i], " ") %>% logr::put() put("## SECOND TITLE FOR THE TABLE GENERATED") subtitle = ifelse(is.null(.reports$Subtitle[i]),.reports$Subtitle[i], " ") %>% logr::put() logr::put("## NAME OF THE PROGRAM USED TO CREATE THE TABLE") program = paste0(.reports$Program[i], " Location: ", .proj, .prog) %>% logr::put() logr::put("## DISCLAIMER IS THE CENTER FOOTER INFO") disclaimer = .reports$Disclaimer[i] %>% logr::put() logr::put("## FOOTNOTES") notes = ifelse(.reports$Notes[i] == "NULL", " ",.reports$Notes[i]) %>% logr::put() logr::put("## FOOTNOTES1") notes1 = ifelse(.reports$Notes1[i] == "NULL", " ",.reports$Notes1[i]) %>% logr::put() logr::put("## FOOTNOTES2") notes2 = ifelse(.reports$Notes2[i] == "NULL", " ",.reports$Notes2[i]) %>% logr::put() logr::put("## FOOTNOTES3") notes3 = ifelse(.reports$Notes3[i] == "NULL", " ",.reports$Notes3[i]) %>% logr::put() notes4 = ifelse(.reports$Notes4[i] == "NULL", " ",.reports$Notes4[i]) %>% logr::put() notes5 = ifelse(.reports$Notes5[i] == "NULL", " ",.reports$Notes5[i]) %>% logr::put() notes6 = ifelse(.reports$Notes6[i] == "NULL", " ",.reports$Notes6[i]) %>% logr::put()
logr::sep("## Get Data Step")
logr::put("## Define data library") libr::libname(sdtm, dirpath, "xpt")
logr::put("## Load library into workspace") libr::lib_load(sdtm)
logr::sep("Make Dataset Step") DS <- sdtm$DS %>% dplyr::select(USUBJID, DSTERM, DSDECOD, DSCAT, DSSTDTC, EPOCH) %>% dplyr::filter(DSSTDTC != "NA-NA-NA") %>% dplyr::filter(DSTERM != "DISPOSITION EVENT") %>% dplyr::filter(ifelse(is.na(DSTERM) & DSCAT == "Option", FALSE, TRUE))
DM <- di_dm_combine(sdtm$DM) %>% dplyr::select(USUBJID, SUBJID, ARM, ASRE)
dat <- plyr::join(DM, DS, type = "left", match = "all")
logr::sep("Assign Column Variables and Entries") dat <- dat %>% select(ARM, SUBJID, ASRE, DSTERM, DSCAT, DSSTDTC, EPOCH)
dat$DSTERM[dat$DSTERM == "Yes" & dat$DSCAT == "Eligibility Criteria (IE)"] <- "Eligibility criteria met" dat$DSTERM[dat$DSTERM == "No" & dat$DSCAT == "Eligibility Criteria (IE)"] <- "Eligibility criteria not met" dat$DSTERM[dat$DSCAT == "End of Double-masked Phase"] <- "Double-masked phase complete" dat$DSTERM[dat$DSTERM == "No" & dat$DSCAT == "Opt-in to Open Label Extension Phase"] <- "Open-label extension phase opt-out" dat$DSTERM[dat$DSTERM == "Yes" & dat$DSCAT == "Opt-in to Open Label Extension Phase"] <- "Open-label extension phase opt-in" dat$DSTERM[dat$DSTERM %in% c("Daniel Brocks, MD","Danie Brocks, MD", "Alan Kwok, OD") & dat$DSCAT == "Opt-in to Open Label Extension Phase"] <- "Open-label phase complete" dat$DSTERM[dat$DSTERM == "No" & dat$DSCAT == "Opt-in to Continue Open Label Extension Phase"] <- "Open-label extension phase opt-out" dat$DSTERM[dat$DSTERM == "Yes" & dat$DSCAT == "Opt-in to Continue Open Label Extension Phase"] <- "Open-label extension phase opt-in to continue"
dat <- dat %>% dplyr::filter(!(DSTERM %in% c("Daniel Brocks, MD","Danie Brocks, MD", "Alan Kwok, OD") & DSCAT %in% "Opt-in to Continue Open Label Extension Phase"))
for (i in 1:length(dat$DSTERM)) { opts <- .options[.options$GNAME %in% "DSTERM",]
if (dat$DSTERM[i] %in% opts$NAME) {
dat$DSCAT[i] <- opts$CAT[which(opts$NAME %in% dat$DSTERM[i])]
}
}
dat$DSCAT[] <- sapply(1:length(dat$DSCAT), function(x) ifelse(dat$DSCAT[x] %in% c("Eligibility Criteria (IE)", "End of Double-masked Phase", "Opt-in to Open Label Extension Phase", "Opt-in to Continue Open Label Extension Phase"), "Protocol Milestone", dat$DSCAT[x]))
dat <- dat %>% dplyr::filter(DSTERM != "")
logr::sep("Assign Column Labels and Make Clean Week12 Dataset") lbls <- Hmisc::label(dat) cols <- base::colnames(dat) dat <- di_clean(dat, cols, lbls) %>% dplyr::group_by(., USUBJID)
logr::sep("Create Table with Defined Features and Variables") tbl <- reporter::create_table(dat) %>% reporter::titles(title, title2, subtitle) %>% reporter::define(ARM, blank_after = TRUE, dedupe = TRUE) %>% reporter::define(var = SUBJID, blank_after = TRUE, dedupe = TRUE) %>% reporter::define(ASRE, blank_after = TRUE, dedupe = TRUE) %>% reporter::column_defaults(., from = ARM, to = ASRE, width = 1) %>% reporter::column_defaults(., from = DSTERM, to = DSTERM, width = 2.5) %>% reporter::column_defaults(., from = DSCAT, to = DSCAT, width = 1.5) %>% reporter::column_defaults(., from = DSSTDTC, to = EPOCH, width = 1) %>% reporter::footnotes(notes, notes1, notes2, align = "left") %>% logr::put()
logr::sep("## Define report object for PDF") rpt <- reporter::create_report(outpath, output_type = "PDF", orientation = "landscape", missing = "") %>% reporter::page_header(lhdr, rhdr) %>% reporter::add_content(tbl, align = "center") %>% reporter::footnotes(paste0("Program: ", program)) %>% reporter::page_footer(Sys.time(), disclaimer, "Page [pg] of [tpg]") %>% logr::put()
logr::sep("## Define report object for RTF") outpath2 <- base::gsub(".pdf", ".rtf", outpath) rpt_rtf <- reporter::create_report(outpath2, output_type = "RTF", orientation = "landscape", missing = "") %>% reporter::page_header(lhdr, rhdr) %>% reporter::add_content(tbl, align = "center") %>% reporter::footnotes(paste0("Program: ", program)) %>% reporter::page_footer(Sys.time(), disclaimer, "Page [pg] of [tpg]") %>% logr::put()
logr::sep("## Write reports to file system") reporter::write_report(rpt) reporter::write_report(rpt_rtf)
libr::lib_unload(sdtm)
logr::log_close()
There is no way to do this in reporter directly. But since you are using libr package anyway, I modified your code to do a quick datastep to perform the deduping in the manner you desire:
library(sassy) library(magrittr)
DS <- tibble::tibble(ARM = c("Active", "Active", "Active", "Active", "Active", "Active", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Active", "Active", "Active"), SUBJID = c("01-301", "01-301", "01-301", "01-302", "01-302", "01-302", "01-303", "01-303", "01-303", "01-304", "01-304", "01-304", "01-305", "01-305", "01-305"), ASRE = c("78/M/N/W", "78/M/N/W", "78/M/N/W", "78/M/N/W", "78/M/N/W", "78/M/N/W", "82/M/N/W", "82/M/N/W", "82/M/N/W", "75/M/N/W", "75/M/N/W", "75/M/N/W", "53/M/N/W", "53/M/N/W", "53/M/N/W"), DSTERM = rep(c("Eligibility criteria met", "Double-masked phase complete", "Open-label phase complete"),5), DSCAT = rep("Protocol Milestone", 15) )
DS_grouped <- datastep(DS, by = "SUBJID", { if (!first.) { ARM <- "" ASRE = "" } })
tbl <- reporter::create_table(DS_grouped) %>% reporter::titles("Disposition Table", "Example", "Subtitle") %>% reporter::define(ARM) %>% reporter::define(SUBJID, blank_after = TRUE, dedupe = TRUE) %>% reporter::define(ASRE) %>% reporter::column_defaults(., from = ARM, to = ASRE, width = 1) %>% reporter::column_defaults(., from = DSTERM, to = DSTERM, width = 2.5) %>% reporter::column_defaults(., from = DSCAT, to = DSCAT, width = 1.5) %>% logr::put()
outpath <- "./output/2022-06-18_l_subject_evaluability_week12.pdf" logr::sep("## Define report object for PDF") rpt <- reporter::create_report(outpath, output_type = "PDF", orientation = "landscape", missing = "") %>% reporter::add_content(tbl, align = "center") %>% logr::put() logr::sep("## Write reports to file system") reporter::write_report(rpt)
The output is this:
I'm super happy you are using these packages in the manner intended. Please feel free to contact me directly at dbosak01@gmail.com. I'm very interested in learning what people are doing in real life. It helps my package development so much.
Also I'll think about how to add a feature like this to reporter. This is something proc report can do, and I think you are right that people will expect the same for reporter.
When I use the dedupe feature, it works concerning the specified column only. However, I'd like to dedupe based on a grouping variable, such as subjid.
In the attached file, you see the dedupe works great for subjid because each subject has a unique id. However, the dedupe for the treatment arm works through the group_by variable subjid. I'd like the dedupe to only occur within the group_by variable.
You see a similar issue with the age/sex/ethnicity/race where two subjects that happen to have the same input is deduped.
In both cases, I'd like the dedupe to occur with respect to the group_by variable subjid, but I can not figure this out with the online guidance.