Closed by jxu 4 months ago
Thanks for the suggestion!
We should be able to support this through custom skim functions. Here's an example:
my_skim <- skim_with(character = get_sfl("factor"), append = FALSE)
data.frame(affiliations = c("Dem", "Dem", "Rep", "Rep", "Ind", "Lib")) |>
  my_skim() |>
  print()
Nice solution. I think it is a suitable option.
Readr's read_csv reads all strings as characters, with no stringsAsFactors switch. This is fine, but when using skimr I've found treating the strings as factors almost always gives more useful results: min/max are string lengths, which isn't useful for categorical levels, while top_counts is much more handy. Maybe there should be a strings_like_factors switch which can quickly report top counts instead of min/max length and counts of empty/whitespace. Otherwise the current dplyr method to convert all chars to factors is something like df %>% mutate(across(where(is.character), factor)).
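For reference, the dplyr workaround mentioned above could look roughly like the sketch below. The data frame and its column names are made up for illustration; the only assumption is that dplyr and skimr are installed.

```r
library(dplyr)
library(skimr)

# Hypothetical example data: one character column, one numeric column
df <- data.frame(
  affiliation = c("Dem", "Dem", "Rep", "Rep", "Ind", "Lib"),
  age = c(34, 51, 46, 29, 62, 38),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor; other columns are untouched
df_factors <- df %>% mutate(across(where(is.character), factor))

# skim() now summarises affiliation as a factor (e.g. top_counts)
# instead of reporting string lengths
skim(df_factors)
```

This changes the data itself rather than how skimr summarises it, so it is only a good fit when having factors downstream is acceptable.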