Closed by bwiernik 3 years ago
Right, I think there is a fixed number of bins (8), and some people actually find it confusing that a "_" basically forces the histogram to a width of 8 even if the data are narrower. So if we were going to change that function, we should deal with that as well.
That said, you can always create your own version of the function.
It actually already has an n_bins option, so you could modify your sfl and use the same function with a different number of bins.
https://github.com/ropensci/skimr/blob/master/R/stats.R#L84
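library(skimr)

# Wrap inline_hist() so the histogram is drawn with 10 bins.
new_hist <- function(x) {
  inline_hist(x, 10)
}
# Add new_hist as an extra summary column for numeric variables.
my_skim <- skim_with(numeric = sfl(new_hist))
my_skim(iris)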
  skim_variable n_missing complete_rate mean    sd  p0 p25  p50 p75
1 Sepal.Length          0             1 5.84 0.828 4.3 5.1 5.8  6.4
2 Sepal.Width           0             1 3.06 0.436 2   2.8 3    3.3
3 Petal.Length          0             1 3.76 1.77  1   1.6 4.35 5.1
4 Petal.Width           0             1 1.20 0.762 0.1 0.3 1.3  1.8
  p100 hist  new_hist
1  7.9 ▆▇▇▅▂ ▂▇▅▇▆▆▅▂▂▂
2  4.4 ▁▆▇▂▁ ▁▁▃▃▇▃▂▂▁▁
3  6.9 ▇▁▆▇▂ ▇▃▁▁▂▆▆▃▂▁
4  2.5 ▇▁▇▅▃ ▇▂▁▂▅▃▁▅▂▃
Would you be open to exposing the nbins argument in the default skim() function?
Thanks for the suggestion! Unfortunately, this isn't really how skimr works. We have a way to customize the hist function that works the same way as for all of the other functions that skimr uses.
The histograms in skim() are really narrow, which makes it very difficult to discern the data distribution. By comparison, the precis() function in the rethinking package (https://github.com/rmcelreath/rethinking) supplies histograms that are about 3 times as wide as the skim() histograms, and these are much more readable. Do you think you could make the histograms wider, or provide something like a bins argument to specify how wide to make them?