ropensci / visdat

Preliminary Exploratory Visualisation of Data
https://docs.ropensci.org/visdat/

Fix percent column labels inconsistency #162

Closed zeehio closed 1 year ago

zeehio commented 1 year ago

Thanks for this package. I find the labelling of column percentages inconsistent when the percentage is between 0.1% and 0.5%.

Here is a reproducible example of the issue, with a plot. Please compare the miss_0.2 label with the labels next to it:

library(visdat)
x <- data.frame(
  miss_0 = rep(1, 10000),
  miss_0.08 = c(rep(NA_real_, 8), rep(1, 10000-8)),
  miss_0.2 = c(rep(NA_real_, 20), rep(1, 10000-20)),
  miss_0.8 = c(rep(NA_real_, 80), rep(1, 10000-80)),
  miss_3.4 = c(rep(NA_real_, 340), rep(1, 10000-340)),
  miss_23 =  c(rep(NA_real_, 2300), rep(1, 10000-2300))
)
vis_miss(x, sort_miss = TRUE)

Created on 2023-02-28 with reprex v2.0.2

Before:

Issue: Percentages between 0.1% and 0.5% are labelled confusingly. For instance, a percentage of "0.2%" is rounded to the nearest whole percent and labelled "0%", as if the column had no missing values.
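To make the rounding concrete, here is a minimal base R sketch. It is only an illustration of whole-percent rounding, not the actual visdat labelling code, and the `pct_missing` vector is typed in by hand from the example above:

```r
# Percent of missing values per column, taken from the example data frame
# (20, 80, 340 and 2300 NAs out of 10000 rows).
pct_missing <- c(miss_0.2 = 0.2, miss_0.8 = 0.8, miss_3.4 = 3.4, miss_23 = 23)

# Round to whole percents and append the percent sign.
# paste0() drops names, so they are restored afterwards.
labels <- paste0(round(pct_missing), "%")
names(labels) <- names(pct_missing)

labels
#> miss_0.2 miss_0.8 miss_3.4  miss_23
#>     "0%"     "1%"     "3%"    "23%"
```

The miss_0.2 column ends up with the same label as a column with no missing values at all.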

Solution:

An alternative would be to label every percentage below 1% as "<1%". That would also be a good solution. If you prefer it, feel free to say so, or just commit the change and merge at will.
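A rough sketch of that "<1%" alternative in base R follows. The helper name `label_pct_missing` is hypothetical and not part of visdat. It assumes the input is a proportion of missing values between 0 and 1, and it does not try to keep any separate label for very small percentages:

```r
# Hypothetical helper (not part of visdat): turn proportions of missing
# values (0 to 1) into percent labels, using "<1%" for anything above
# zero but below one percent.
label_pct_missing <- function(prop) {
  pct <- prop * 100
  ifelse(
    pct == 0,
    "0%",
    ifelse(pct < 1, "<1%", paste0(round(pct), "%"))
  )
}

# Proportions matching the columns of the example data frame.
label_pct_missing(c(0, 0.0008, 0.002, 0.008, 0.034, 0.23))
#> [1] "0%"  "<1%" "<1%" "<1%" "3%"  "23%"
```

With this scheme, only a column with no missing values at all is labelled "0%".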

With this pull request applied:

# remotes::install_github("zeehio/visdat@patch-1")
library(visdat)
x <- data.frame(
  miss_0 = rep(1, 10000),
  miss_0.08 = c(rep(NA_real_, 8), rep(1, 10000-8)),
  miss_0.2 = c(rep(NA_real_, 20), rep(1, 10000-20)),
  miss_0.8 = c(rep(NA_real_, 80), rep(1, 10000-80)),
  miss_3.4 = c(rep(NA_real_, 340), rep(1, 10000-340)),
  miss_23 =  c(rep(NA_real_, 2300), rep(1, 10000-2300))
)
vis_miss(x, sort_miss = TRUE)

Created on 2023-02-28 with reprex v2.0.2

njtierney commented 1 year ago

Hello!

This is a very lovely PR - thank you so much for taking the time to lay it out in such detail!

This catches a new bug, and I'm really grateful that you've caught it and also given a nice elegant solution 😄

I can do these things if you would prefer (and am happy to!) but would you be able to add the following:

Thank you again!

zeehio commented 1 year ago

Thanks, I will make those changes (using "<1%" looks better than "0.3%", so I will change that as well).

It's been a busy week, I will try to get this done either today or next week.

Have a great weekend!

zeehio commented 1 year ago

Done. Feel free to merge 👍

njtierney commented 1 year ago

Marvellous, thank you!