phuse-org / valtools

Validation framework for R packages used in clinical research and drug development.
https://phuse-org.github.io/valtools/

internal function `scrape_roxygen` expects Last Updated By: to have no trailing spaces #144

Closed mariev closed 3 years ago

mariev commented 3 years ago

The internal function `scrape_roxygen` attempts to transform the legacy section tags "Last Updated By:" and "Last Updated Date" into "editor" and "editDate" for convenience. The transformation appears to assume that the section tag line has no trailing whitespace before the end of the line (e.g. an extra space), which does not always hold. If there is a trailing space, for example, the `strsplit` call made via `cleanup_section_last_update` breaks, because it expects the split pattern `:\n`, not `: \n`.
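The failure mode can be reproduced without the package. The following is a minimal sketch, assuming the section tag value has the form header, colon, newline, name; the variable names are illustrative and not part of {valtools}.

```r
# Hypothetical section-tag values; the real ones come from roxygen2 parsing.
clean    <- "Last Updated By:\nEllis Hughes"
trailing <- "Last Updated By: \nEllis Hughes"  # note the space before \n

# Fixed-string split on ":\n", the pattern the cleanup step expects.
split_clean    <- strsplit(clean, ":\n", fixed = TRUE)[[1]]
split_trailing <- strsplit(trailing, ":\n", fixed = TRUE)[[1]]

length(split_clean)     # 2: header and value
length(split_trailing)  # 1: the pattern never matched

# Indexing the second element of a length-1 vector raises the reported error.
err <- tryCatch(split_trailing[[2]], error = function(e) conditionMessage(e))
err
```

With the trailing space, the fixed pattern never matches, so the split returns the whole string as a single element and the later `[[2]]` lookup fails.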

# load internal {valtools} functions (e.g. with devtools::load_all())
## i Loading valtools

# file of interest
this_process_spec <- "process_specifications/specifications/Process_Specifications_001.Rmd" 

# encounter error initially
valtools:::scrape_roxygen(this_process_spec, type = "Rmd")
## Error in section_split[[2]] : subscript out of bounds 

# to debug
text <- readLines(this_process_spec)
text <- roxy_text(text, file = this_process_spec, class = "rmd")
text2 <- roxy_text(
  c(text[grepl("^#'", text)], "NULL"),
  file = roxy_text_file(text),
  class = roxy_text_class(text)
)
roxyblocks <- roxygen2::parse_text(text2, env = NULL)
block <- roxyblocks[[1]]
section_tags <- block_get_tags(block = block, tags = "section")
tags <- section_tags[[1]]
section_split <- strsplit(tags[["val"]], ":\n", fixed = TRUE)[[1]]

# error at https://github.com/phuse-org/valtools/blob/main/R/parse_roxygen.R#L296
selection <- section_split[[2]]
## Error in section_split[[2]] : subscript out of bounds

# to fix for this instance
section_split <- strsplit(tags[["val"]], ": \n", fixed = TRUE)[[1]]
selection <- section_split[[2]]
selection
## [1] "Ellis Hughes"

thebioengineer commented 3 years ago

Updating line 296 of parse_roxygen.R should resolve this:

Update the split pattern to the regex `[:]\s*\n` (written as "[:]\\s*\n" in an R string) and remove the `fixed = TRUE` argument.
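As a rough illustration of the suggested change, the sketch below applies a regex split to tag values with and without trailing whitespace. `split_section_tag` is a hypothetical helper written for this example, not the actual code at line 296 of parse_roxygen.R.

```r
# Hypothetical helper, not part of {valtools}: split a section tag value
# into header and content, tolerating whitespace between ":" and the newline.
split_section_tag <- function(val) {
  # Regex split (no fixed = TRUE), so "\\s*" absorbs trailing spaces or tabs.
  strsplit(val, "[:]\\s*\n")[[1]]
}

split_section_tag("Last Updated By:\nEllis Hughes")
split_section_tag("Last Updated By: \nEllis Hughes")
split_section_tag("Last Updated By:\t \nEllis Hughes")
# Each call returns c("Last Updated By", "Ellis Hughes").
```

Because `\s` also matches newlines, blank lines directly after the header are consumed by the split as well.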