rubenarslan / codebook

Cook rmarkdown codebooks from metadata on R data frames
https://rubenarslan.github.io/codebook/

unique values list as optional parameter to codebook_table #74

Open datapumpernickel opened 2 years ago

datapumpernickel commented 2 years ago

Hi,

thanks for this really cool package! Sometimes in codebooks, it can be helpful to have a comma-separated list of the unique values of each variable in the data frame.

For example, a data frame like this:

data.frame(geo = c("GBR","DEU","FRA","GBR"), value = c(1,2,3,4))

could have a unique values output. For character columns it would default to listing all unique values ("GBR, DEU, FRA"), and once the string exceeds a certain number of values or a certain str_length() it would be cut off and displayed as something like "GBR, DEU, [...]". Numeric columns could display a range instead, something like "1-4". This would be extremely helpful for data frames with a large number of variables that are not labelled in any form.

Now, please try to ignore the horrible way this is written, but this is a quick work-around that I am using right now, which obviously is not very robust...

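library(dplyr)
library(tidyr)
library(purrr)
library(stringr)

# Reshape to long format and split into one data frame per variable
split_dataframe <- data.frame(geo = c("GBR","DEU","FRA","GBR"),
                       value = c(1,2,3,4)) %>%
  mutate(across(.cols = everything(), .fns = as.character)) %>%
  pivot_longer(cols = everything(), names_to = "names", values_to = "values") %>%
  group_split(names)

extract_unique_value <- function(df, max_length) {
  df <- df %>% filter(!is.na(values))
  # as.numeric() returns NA (with a warning) for non-numeric strings
  numeric_values <- suppressWarnings(as.numeric(df$values))
  if (all(!is.na(numeric_values))) {
    # Compare as numbers: min()/max() on strings would put "10" before "2"
    unique_values <- paste0(min(numeric_values), " - ", max(numeric_values))
  } else {
    unique_values <- paste0(sort(unique(df$values)), collapse = ", ")
    if (str_length(unique_values) > max_length) {
      unique_values <- paste0(substr(unique_values, 1, max_length), " [...]")
    }
  }
  return(unique_values)
}

map(split_dataframe, extract_unique_value, max_length = 200)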
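
For comparison, the same idea can probably be done column by column in base R, without reshaping the data first. This is only a rough sketch: `summarise_unique_values()` and its `max_length` argument are hypothetical names made up for illustration and are not part of the codebook package. It checks `is.numeric()` on the original column instead of parsing strings, so numeric ranges are compared as numbers.

```r
# Hypothetical helper (not part of codebook): summarise the observed
# values of a single column as one short string.
summarise_unique_values <- function(x, max_length = 200) {
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(NA_character_)
  }
  if (is.numeric(x)) {
    # Numeric columns are reported as a range, e.g. "1 - 4"
    return(paste0(min(x), " - ", max(x)))
  }
  # Everything else is listed as sorted, comma-separated unique values
  values <- paste(sort(unique(as.character(x))), collapse = ", ")
  if (nchar(values) > max_length) {
    # Cut the string after max_length characters and mark the truncation
    values <- paste0(substr(values, 1, max_length), " [...]")
  }
  values
}

df <- data.frame(geo = c("GBR", "DEU", "FRA", "GBR"), value = c(1, 2, 3, 4))

# One summary string per column, returned as a named character vector
unique_values <- vapply(df, summarise_unique_values, character(1), max_length = 200)
unique_values
```

The result is a character vector named after the columns, so it could in principle be matched by variable name against the rows of a codebook table, assuming that table has one row per variable.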

Please let me know if I am overlooking some function in this package.

Thanks a bunch!