Closed Rednose22 closed 1 month ago
r2rtf is designed only to format a data frame as is.
The requested features should be handled at the data manipulation stage, so a pipeline can be created that first manipulates the data using the tidyverse or another approach.
For item 1, please use the 'arrange' function in 'dplyr' with other variables as needed.
For item 2, you can sort by the variables and then remove them using the 'select' function in 'dplyr', or another approach.
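A minimal sketch of that kind of pipeline, assuming dplyr is installed. The `listing` data frame and its values are made up for illustration, and treating IMPORTANT as a sort-only helper column is an assumption about what item 2 asks for:

```r
library(dplyr)

# Hypothetical toy data; IMPORTANT is used only as a sort key here.
listing <- data.frame(
  STUDYID   = c(102, 101, 101),
  SUBJID    = c("0001", "0002", "0001"),
  DVTERM    = c("Rash", "Nausea", "Headache"),
  IMPORTANT = c("No", "No", "Yes"),
  stringsAsFactors = FALSE
)

prepared <- listing %>%
  # Item 1: sort by any variables needed, "Yes" records first.
  arrange(desc(IMPORTANT), STUDYID, SUBJID) %>%
  # Item 2: drop the helper column before passing the data on.
  select(-IMPORTANT)
```

The resulting `prepared` data frame can then be passed to the r2rtf functions unchanged.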
Hi @elong0527, thanks for your quick reply. I understand that r2rtf only converts the data frame to RTF as is. The issue is that lines 189-191 of rtf_body() check whether the data is sorted consistently with the combination of subline_by, page_by, and group_by, and stop with an error if it is not. In a listing, though, there are always cases where the sort order of the data is not consistent with group_by (in the example above, subline_by and page_by are NULL).
The same applies to your suggestion for item 2: if I sort the data frame by extra variables and then remove them, rtf_body() still checks the sort order against subline_by, page_by, and group_by, so the code could still break.
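To illustrate why such a check fails for this kind of listing, here is a rough base R approximation. `is_sorted_by()` is a hypothetical helper written for this comment, not r2rtf's actual source, and the toy data is made up:

```r
# Hypothetical helper: approximates the kind of check described above.
is_sorted_by <- function(data, by) {
  # order() returns the row permutation that sorts `data` by `by`;
  # the data is already sorted when that permutation is 1, 2, ..., n.
  ord <- do.call(order, unname(as.list(data[by])))
  identical(ord, seq_len(nrow(data)))
}

# Rows are ordered with IMPORTANT = "Yes" first, then by STUDYID.
listing <- data.frame(
  IMPORTANT = c("Yes", "Yes", "No", "No"),
  STUDYID   = c(101, 102, 101, 103),
  stringsAsFactors = FALSE
)

sorted_as_listed   <- is_sorted_by(listing, "STUDYID")
sorted_after_order <- is_sorted_by(listing[order(listing$STUDYID), ], "STUDYID")
```

Here `sorted_as_listed` is FALSE because STUDYID restarts when IMPORTANT changes, while `sorted_after_order` is TRUE once the rows are reordered by STUDYID alone.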
In the example, it seems you want a listing that separates "clinically important" records from "not clinically important" ones. You may want to split the results into two listings with proper titles.
sort_by <- c("IMPORTANT", "STUDYID", "SITENUM", "COUNTRY", "SUBJID")
group_by <- c("STUDYID", "COUNTRY", "SITENUM", "USUBJID", "DVCAT", "DVTERM")
If you need to create a listing exactly like the one you suggest, group_by does not fit your purpose. One option is to manipulate the data frame by replacing repeated values with NA.
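The first suggestion above, two separate listings, could be sketched in base R as follows. The toy data frame `dat` and the shortened `group_by` vector are assumptions for illustration only:

```r
# Hypothetical toy data with a subset of the listing columns.
dat <- data.frame(
  STUDYID   = c(101, 101, 102, 102),
  USUBJID   = c("101-0001", "101-0002", "102-0001", "102-0002"),
  IMPORTANT = c("Yes", "No", "Yes", "No"),
  stringsAsFactors = FALSE
)
group_by <- c("STUDYID", "USUBJID")

# One data frame per level of IMPORTANT ("No" and "Yes").
parts <- split(dat, dat$IMPORTANT)

# Sort each part by the grouping variables only.
parts <- lapply(parts, function(d) {
  d[do.call(order, unname(as.list(d[group_by]))), , drop = FALSE]
})
```

Each element of `parts` is then sorted by the grouping variables on its own, so it could be rendered as a separate listing with its own title.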
Yes, the 'clinically important' records need to be shown first in the listing, followed by the 'not clinically important' records, all in one listing, and there are more mockups with the same issue in the real project. It would be a great new feature for r2rtf, since replacing repeated values with NA by hand is less straightforward than using the group_by argument.
Could you provide a screenshot of the RTF output based on the dummy_data data defined above?
Thanks a lot @elong0527
Here is a code example that manipulates the data and creates the exact table in the screenshot.
library(r2rtf)
library(dplyr) # provides %>%, arrange(), mutate(), across(), if_else()
# Create the dummy data frame
dummy_data <- data.frame(
  STUDYID = c(101, 101, 101, 101, 102, 102, 102, 103, 103, 104, 104),
  COUNTRY = c("USA", "USA", "CAN", "CAN", "GER", "GER", "GER", "FRA", "FRA", "JPN", "JPN"),
  SITENUM = c(1001, 1001, 1002, 1002, 1003, 1003, 1003, 1004, 1004, 1005, 1005),
  SUBJID = c("0001", "0002", "0001", "0002", "0001", "0002", "0003", "0001", "0002", "0001", "0002"),
  USUBJID = c("101-1001-0001", "101-1001-0002", "101-1002-0001", "101-1002-0002", "102-1003-0001",
              "102-1003-0002", "102-1003-0003", "103-1004-0001", "103-1004-0002", "104-1005-0001",
              "104-1005-0002"),
  DVCAT = c("Safety", "Efficacy", "Safety", "Efficacy", "Safety", "Efficacy", "Safety", "Efficacy",
            "Safety", "Efficacy", "Safety"),
  DVTERM = c("Headache", "Nausea", "Dizziness", "Vomiting", "Fatigue", "Rash", "Insomnia", "Anxiety",
             "Headache", "Vomiting", "Nausea"),
  DVSPID = c("001", "002", "003", "004", "005", "006", "007", "008", "009", "010", "011"),
  IMPORTANT = c("Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)
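# Sort so that IMPORTANT = "Yes" records come first, then blank out values
# that repeat the value in the previous row.
df <- dummy_data %>%
  arrange(desc(IMPORTANT), STUDYID, SITENUM, COUNTRY, SUBJID) %>%
  mutate(
    across(c("STUDYID", "COUNTRY", "SITENUM", "USUBJID"), function(x) {
      # TRUE where a value equals the one in the row above; the first row is never blanked
      if_else(c(FALSE, x[-n()] == c(x[-1])), NA, x)
    })
  )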
orientation <- 'landscape'
page_size <- 16
title <- "Example of r2rtf"
colheader <- "Trial Number | Country | Site Number | Subject ID | Unique Subject Identifier | Deviation Category | Protocol Deviation Description | Protocol Deviation ID | Clinically Important"
rel_width <- c(14, 14, 12, 15, 20, 25, 62, 20, 15)
# Create RTF document with r2rtf
rtf <- df |>
  r2rtf::rtf_page(orientation = orientation,
                  nrow = page_size) |>
  r2rtf::rtf_title(title) |>
  r2rtf::rtf_colheader(
    colheader,
    col_rel_width = rel_width,
    cell_vertical_justification = "top"
  ) |>
  r2rtf::rtf_body(
    col_rel_width = rel_width,
    text_convert = FALSE
  )
rtf |>
  rtf_encode() |>
  write_rtf("tmp.rtf")
r2rtf::rtf_body() checks whether the data is sorted by the combination of subline_by, page_by, and group_by; if not, it stops with an error. In a listing especially, though, the sort order often cannot be consistent with the combination of these three variables. Is it possible to add one more argument, sort_by, to rtf_body(), so that this safeguard is suppressed when sort_by is defined, or is there any other approach?
# Create the dummy data frame
dummy_data <- data.frame(
  STUDYID = c(101, 101, 101, 101, 102, 102, 102, 103, 103, 104, 104),
  COUNTRY = c("USA", "USA", "CAN", "CAN", "GER", "GER", "GER", "FRA", "FRA", "JPN", "JPN"),
  SITENUM = c(1001, 1001, 1002, 1002, 1003, 1003, 1003, 1004, 1004, 1005, 1005),
  SUBJID = c("0001", "0002", "0001", "0002", "0001", "0002", "0003", "0001", "0002", "0001", "0002"),
  USUBJID = c("101-1001-0001", "101-1001-0002", "101-1002-0001", "101-1002-0002", "102-1003-0001",
              "102-1003-0002", "102-1003-0003", "103-1004-0001", "103-1004-0002", "104-1005-0001",
              "104-1005-0002"),
  DVCAT = c("Safety", "Efficacy", "Safety", "Efficacy", "Safety", "Efficacy", "Safety", "Efficacy",
            "Safety", "Efficacy", "Safety"),
  DVTERM = c("Headache", "Nausea", "Dizziness", "Vomiting", "Fatigue", "Rash", "Insomnia", "Anxiety",
             "Headache", "Vomiting", "Nausea"),
  DVSPID = c("001", "002", "003", "004", "005", "006", "007", "008", "009", "010", "011"),
  IMPORTANT = c("Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)
# Define metadata for r2rtf
sort_by <- c("IMPORTANT", "STUDYID", "SITENUM", "COUNTRY", "SUBJID")
group_by <- c("STUDYID", "COUNTRY", "SITENUM", "USUBJID", "DVCAT", "DVTERM")
orientation <- 'landscape'
page_size <- 16
title <- "Example of r2rtf"
colheader <- "Trial Number | Country | Site Number | Subject ID | Unique Subject Identifier | Deviation Category | Protocol Deviation Description | Protocol Deviation ID | Clinically Important"
rel_width <- c(14, 14, 12, 15, 20, 25, 62, 20, 15)
# Order data by sort_by
dummy_data <- dummy_data[do.call(order, dummy_data[sort_by]), ]
# Create RTF document with r2rtf
rtf <- dummy_data |>
  r2rtf::rtf_page(orientation = orientation,
                  nrow = page_size) |>
  r2rtf::rtf_title(title) |>
  r2rtf::rtf_colheader(
    colheader,
    col_rel_width = rel_width,
    cell_vertical_justification = "top"
  ) |>
  r2rtf::rtf_body(
    col_rel_width = rel_width,
    text_convert = FALSE,
    group_by = group_by
  )
> Error in r2rtf::rtf_body(r2rtf::rtf_colheader(r2rtf::rtf_title(r2rtf::rtf_page(dummy_data, : Data is not sorted by STUDYID, COUNTRY, SITENUM, USUBJID, DVCAT, DVTERM
But it's not applicable in the r2rtf.