Clean obsolete files in R, tests or vignettes

statnmap commented 3 years ago

Validation

During inflate_all():

[x] I would like a function at the scale of my package that tells me if a file in R, tests or vignettes is not legitimate : not coming from a flat file, or specifically registered by the user
[x] I would like this same function to ask me if it can delete this non-legitimate files. It asks with a question or use the parameter clean = TRUE to delete without question.
[x] I would like a function that detects and informs, during the inflate of a specific flat file, if I am about to make obsolete some R, test or vignettes files.
[x] If I change the name of the main function of a function chunk, a new R, tests files will be created. I would like the R and tests files of the older name to be deleted when I inflate my flat file, whether I am asked or directly with clean = TRUE

Tech

Development will mainly take place in : "dev/flat_clean_fusen_files.Rmd"

Level of the package

Modify check_not_registered_files() to ask if: you want to delete all or you want to register all (hence, run register_all_to_config()) or register only a part of it ?
Create a specific function named delete_all_not_registered() for the delete part
If the user want to "register only a part of it", for now, the message would be : modify the csv file "config_not_registered.csv" and then we need to tell them to run delete_all_not_registered() after that.
- Maybe we can make this interactive in a way/

Level of the inflate of one flat file

Modify the df_to_config(flat_file_path = relative_flat_file) when used during inflate() so that the clean = TRUE parameter also asks to clean R, tests or vignettes
- For now, it only delete the part of the "fusen_config.yml" that listed previous version and update it. It does not deal with the files in R, tests or vignettes created before.

MargotBr commented 2 years ago

Hello @statnmap,

I don't know if this helps, but here is the code I produced to identify the "obsolete" functions.

It compares the functions present in the "flat" files (using the names of the chunks r function-...) and the functions present in the R/ directory.

I excluded the functions from the package that were not built with flat files (notably functions related to the shiny app, and utils functions).

identify_fct_in_flat <- function(name_flat_rmd) {

  flat_in_line <- readLines(file.path("dev", name_flat_rmd))

  names_fct_in_flat <- flat_in_line[which(stringr::str_detect(flat_in_line, "```\\{r function-"))] %>%
    stringr::str_remove_all("```\\{r function-") %>%
    stringr::str_remove_all("\\}")

  return(names_fct_in_flat)

}

fct_in_flat_files <- list.files("dev")[list.files("dev") %>% stringr::str_detect("^flat")] %>%
  purrr::map(identify_fct_in_flat) %>%
  unlist() %>%
  stringr::str_c(".R")

fct_in_r_folder <- list.files("R")

obsolete_fcts <- fct_in_r_folder[!(fct_in_r_folder %in% fct_in_flat_files)]
obsolete_fcts <- obsolete_fcts[!grepl("app|mod_|zzz|listofcodes-package|utils-pipe|welcome_banner", obsolete_fcts)]
obsolete_fcts

ColinFay commented 2 years ago

Hey,

Quick thoughts on this code :

you can create function that do no match the function- chunk name

    ```{r function-https, file.name = "https"}
#' HTTP req
#'
#' @importFrom cli cli_alert_success cli_alert_danger
#' @importFrom httr GET POST add_headers status_code content
#'
#' @return Un objet response de {httr}, lisible avec httr::content()
#' @noRd
#' @rdname HTTP
health_check <- function(
  url
) {
  # [...]
}


- you can have several functions in the same chunk 

```r
    ```{r function-https, file.name = "https"}
#' HTTP req
#'
#' @importFrom cli cli_alert_success cli_alert_danger
#' @importFrom httr GET POST add_headers status_code content
#'
#' @return Un objet response de {httr}, lisible avec httr::content()
#' @noRd
#' @rdname HTTP
health_check <- function(
  url
) {
  # [...]
}

#' @noRd
#' @rdname HTTP
delete_ <- function(url, expected_status_code = 200) {
  # [...]
}

statnmap commented 2 years ago

A function to detect if there are duplicate names in "R/".
This requires to extract parse_fun_vec from parse_fun() that is already in {fusen}

parse_fun_vec <- function(code) {
  # code <- unlist(rmd_node_code(x[["ast"]]))
  regex_isfunction <- paste("function(\\s*)\\(", "R6Class(\\s*)\\(", 
                            sep = "|")
  regex_extract_fun_name <- paste("[\\w[.]]*(?=(\\s*)(<-|=)(\\s*)function)", 
                                  "[\\w[.]]*(?=(\\s*)(<-|=)(\\s*)R6Class)", "[\\w[.]]*(?=(\\s*)(<-|=)(\\s*)R6::R6Class)", 
                                  sep = "|")
  fun_name <- stringi::stri_extract_first_regex(code[grep(regex_isfunction, 
                                                          code)], regex_extract_fun_name) %>% gsub(" ", "", .)
  code <- gsub(pattern = "#'\\s*@", "#' @", code)
  first_function_start <- grep(regex_isfunction, code)[1]
  all_hastags <- grep("^#'", code)
  if (length(all_hastags) != 0) {
    last_hastags_above_first_fun <- max(all_hastags[all_hastags < 
                                                      first_function_start])
  }
  else {
    last_hastags_above_first_fun <- NA
  }
  if (!any(grepl("@export|@noRd", code))) {
    if (!is.na(last_hastags_above_first_fun)) {
      code <- c(code[1:last_hastags_above_first_fun], "#' @noRd", 
                code[(last_hastags_above_first_fun + 1):length(code)])
    }
    else if (all(grepl("^\\s*$", code))) {
      code <- character(0)
    }
    else if (!is.na(first_function_start)) {
      code <- c("#' @noRd", code)
    }
  }
  all_arobase <- grep("^#'\\s*@|function(\\s*)\\(", code)
  example_pos_start <- grep("^#'\\s*@example", code)[1]
  example_pos_end <- all_arobase[all_arobase > example_pos_start][1] - 
    1
  example_pos_end <- ifelse(is.na(example_pos_end), grep("function(\\s*)\\(", 
                                                         code) - 1, example_pos_end)
  tag_filename <- gsub("^#'\\s*@filename\\s*", "", code[grep("^#'\\s*@filename", 
                                                             code)])
  tag_rdname <- gsub("^#'\\s*@rdname\\s*", "", code[grep("^#'\\s*@rdname", 
                                                         code)])
  rox_filename <- c(tag_filename, tag_rdname)[1]
  code[grep("^#'\\s*@filename", code)] <- "#'"
  tibble::tibble(fun_name = fun_name[1], code = list(code), 
                 example_pos_start = example_pos_start, example_pos_end = example_pos_end, 
                 rox_filename = rox_filename)
}

get_duplicate_functions <- function(path = "R") {
  all_r <- list.files(path, full.names = TRUE)
  all_funs <- lapply(all_r, function( one_r) {
    # one_r <- all_r[1]
    r_lines <- readLines(one_r)
    parse_fun_vec(r_lines)
  })
  res <- do.call("rbind", all_funs)
  res_clean <- res[!is.na(res[["fun_name"]]),]
  res_dups <- res_clean[["fun_name"]][duplicated(res_clean[["fun_name"]])]

  if (length(res_dups) != 0) {
    message('There are duplicated function names:', paste(res_dups, collapse = ", "))
  }
  list(
    all_funs = res_clean,
    duplicated = res_dups
  )
}
}

statnmap commented 7 months ago

Since 'fusen' registers files created during inflate in the config_fusen file, it knows what was the name of the created files in the previous inflate. Hence, we can ask to 'clean' these files directly.
By default, it asks when such a deleted filename occurs, but you can specifiy clean=TRUE in the inflate options.
I recommand to use inflate_all() too, to take care of functions that you moved from one flat to the other, so that you do not forget to inflate each file modified.

ThinkR-open / fusen