tomaszaba / ipccheckr

Toolkit for Performing IPC Acute Malnutrition-related Data Checks
GNU General Public License v3.0
2 stars 0 forks source link

revised code #12

Closed tomaszaba closed 3 months ago

tomaszaba commented 3 months ago

Hi Ernest,

Please review this PR.

The updates

  1. I am using embracing {{ }} to address the indirection, hence allow variables to called inside function without quotation marks. On this, for function arguments that takes NULL as default, the arguments still need to be given within "". So far this happens in process_age().
  2. Related to embracing, there is a note I get from R CMD Check that I still need to find out how to fix.
  3. I have revised function check_sample_size(). I just changed a bit as the way it was written was not working properly, it was not returning a data frame, so I tweaked to do that.
  4. Have removed all return calls
  5. Have condensated function documentations. Here, in quality_classifiers.R I kept the documentation as it is as I think there are some specificities.
  6. Changed the verb return to generate and revised the spelling.
  7. Changed stop() with match.arg().

What I could not revise?

You had suggested to revise rename with rename_with. According to my readings, using rename_with() seemed to be more handy when I want to change the pattern of columns, using toupper, etc. However, in my case, I am changing each column with specific details. While this could be achieved this way:

data <- data %>%
  rename_with(~ new_names[.], all_of(names(new_names)))

That would require from me create a an object storing the variable names in the same rename() syntax order new_name = old_name like below:

# Create a named vector with the new column names
new_names <- c(
  flagged = "Flagged data (%)",
  flagged_class = "Class. of flagged data",
  sex_ratio = "Sex ratio (p)",
  sex_ratio_class = "Class. of sex ratio",
  age_ratio = "Age ratio (p)",
  age_ratio_class = "Class. of age ratio",
  dps = "DPS (#)",
  dps_class = "Class. of DPS",
  sd = "Standard Dev* (#)",
  sd_class = "Class. of standard dev",
  skew = "Skewness* (#)",
  skew_class = "Class. of skewness",
  kurt = "Kurtosis* (#)",
  kurt_class = "Class. of kurtosis",
  quality_score = "Overall score",
  quality_class = "Overall quality"
)

So I thought this was almost the same as I already have in my current functions, then I decided to leave it as it, or test other ways. I went for this:

    generate_pretty_table_whz <- function(df){

        df <- df |>
          mutate(
            flagged = .data$flagged |>
              label_percent(accuracy = 0.1, suffix = "%", decimal.mark = ".")(),
            sex_ratio = .data$sex_ratio |>
              label_pvalue()(),
            age_ratio = .data$age_ratio |>
              label_pvalue()(),
            sd = round(.data$sd, digits = 2),
            dps_wgt = round(.data$dps_wgt),
            dps_hgt = round(.data$dps_hgt),
            skew = round(.data$skew, digits = 2),
            kurt = round(.data$kurt, digits = 2)
          )

        ## Rename Columns ----
        new_names <- c(
          "area", "n", "Flagged data (%)", "Class. of flagged data",
          "Sex ratio (p)", "Class. of sex ratio", "Age ratio (p)", "Class. of age ratio",
          "DPS weight (#)", "Class. DPS weight", "DPS height (#)", "Class. DPS height",
          "Standard Dev* (#)", "Class. of standard dev", "Skewness* (#)",
          "Class. of skewness", "Kurtosis* (#)", "Class. of kurtosis", "Overall score",
          "Overall quality"
        )

        colnames(df) <- new_names
        df
    }

But it returns just a list of column.

Please feel free to revise the function and I will learn how to effectively achieve that

tomaszaba commented 3 months ago

Hi Ernest,

Please review this PR.

The updates

  1. I am using embracing {{ }} to address the indirection, hence allow variables to called inside function without quotation marks. On this, for function arguments that takes NULL as default, the arguments still need to be given within "". So far this happens in process_age().
  2. Related to embracing, there is a note I get from R CMD Check that I still need to find out how to fix.
  3. I have revised function check_sample_size(). I just changed a bit as the way it was written was not working properly, it was not returning a data frame, so I tweaked to do that.
  4. Have removed all return calls
  5. Have condensated function documentations. Here, in quality_classifiers.R I kept the documentation as it is as I think there are some specificities.
  6. Changed the verb return to generate and revised the spelling.
  7. Changed stop() with match.arg().

What I could not revise?

You had suggested to revise rename with rename_with. According to my readings, using rename_with() seemed to be more handy when I want to change the pattern of columns, using toupper, etc. However, in my case, I am changing each column with specific details. While this could be achieved this way:

data <- data %>%
  rename_with(~ new_names[.], all_of(names(new_names)))

That would require from me create a an object storing the variable names in the same rename() syntax order new_name = old_name like below:

# Create a named vector with the new column names
new_names <- c(
  flagged = "Flagged data (%)",
  flagged_class = "Class. of flagged data",
  sex_ratio = "Sex ratio (p)",
  sex_ratio_class = "Class. of sex ratio",
  age_ratio = "Age ratio (p)",
  age_ratio_class = "Class. of age ratio",
  dps = "DPS (#)",
  dps_class = "Class. of DPS",
  sd = "Standard Dev* (#)",
  sd_class = "Class. of standard dev",
  skew = "Skewness* (#)",
  skew_class = "Class. of skewness",
  kurt = "Kurtosis* (#)",
  kurt_class = "Class. of kurtosis",
  quality_score = "Overall score",
  quality_class = "Overall quality"
)

So I thought this was almost the same as I already have in my current functions, then I decided to leave it as it, or test other ways. I went for this:

  generate_pretty_table_whz <- function(df){

      df <- df |>
        mutate(
          flagged = .data$flagged |>
            label_percent(accuracy = 0.1, suffix = "%", decimal.mark = ".")(),
          sex_ratio = .data$sex_ratio |>
            label_pvalue()(),
          age_ratio = .data$age_ratio |>
            label_pvalue()(),
          sd = round(.data$sd, digits = 2),
          dps_wgt = round(.data$dps_wgt),
          dps_hgt = round(.data$dps_hgt),
          skew = round(.data$skew, digits = 2),
          kurt = round(.data$kurt, digits = 2)
        )

      ## Rename Columns ----
      new_names <- c(
        "area", "n", "Flagged data (%)", "Class. of flagged data",
        "Sex ratio (p)", "Class. of sex ratio", "Age ratio (p)", "Class. of age ratio",
        "DPS weight (#)", "Class. DPS weight", "DPS height (#)", "Class. DPS height",
        "Standard Dev* (#)", "Class. of standard dev", "Skewness* (#)",
        "Class. of skewness", "Kurtosis* (#)", "Class. of kurtosis", "Overall score",
        "Overall quality"
      )

      colnames(df) <- new_names
      df
  }

But it returns just a list of column.

Please feel free to revise the function and I will learn how to effectively achieve that

Hi Ernest, Regarding the rename,

this approach

generate_pretty_table_muac <- function(df) {

  ## Format data frame ----
  df <- df |>
    mutate(
      flagged = .data$flagged |>
        label_percent(accuracy = 0.1, suffix = "%", decimal.mark = ".")(),
      sex_ratio = .data$sex_ratio  |>  scales::label_pvalue()(),
      sd = round(.data$sd, digits = 2),
      dps = round(.data$dps)
    ) |>
    ## Rename columns ----
  setNames(
    c("Flagged data (%)", "Class. of flagged data", "Sex ratio (p)", 
      "Class. of sex ratio", "DPS(#)", "Class. of DPS", "Standard Dev* (#)",
      "Class. of standard dev"
  ))
  ## Return data frame ----
  df
}

Should simplify the function. Is just that I need to assure that the order of appearance is correct.

Let me know what you think of this.