jthomasmock / gtExtras

A Collection of Helper Functions for the gt Package.
https://jthomasmock.github.io/gtExtras/

Feature request: option to limit the x axis in gt_sparkline() #10

Closed erdirstats closed 3 years ago

erdirstats commented 3 years ago

Hi. It would be great to have an option to limit the x axis, to avoid the long tails, when using type = "density" in gt_sparkline(). Setting same_limit = FALSE doesn't always solve the problem, and when it does, it has the side effect of making comparisons between rows very difficult. The attached image shows an example where this feature would be useful.

jthomasmock commented 3 years ago

Thanks for submitting!

Please see the new behavior for gt_sparkline(), which has been refactored to use a density() calculation internally. This gives a better raw implementation of the density.
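For context, here is a minimal base R sketch of what a density() calculation returns for a single row's values. It is not the gtExtras source, and the variable names are only illustrative. It shows that, with default arguments, stats::density() evaluates the curve on a 512-point grid that extends three bandwidths (cut = 3) beyond the smallest and largest observations, which is one source of tails beyond the data.

```r
# Illustration only: base R density() on one vector of values,
# not the internal gtExtras code.
set.seed(37)
x <- rnorm(100, mean = 10, sd = 1)

d <- stats::density(x)  # defaults: n = 512 grid points, cut = 3

# Number of points on the evaluated curve
length(d$x)

# The grid starts and ends 3 bandwidths outside the observed data
c(data_min = min(x), curve_min = min(d$x), three_bw = 3 * d$bw)
c(data_max = max(x), curve_max = max(d$x))
```

The d$x and d$y vectors are what a plotting layer would draw, so any padding in d$x shows up as extra tail in the sparkline.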

You can install with remotes::install_github("jthomasmock/gtExtras")

library(gt)
library(dplyr, warn.conflicts = FALSE)
library(purrr)

set.seed(37)
fake_data <- tibble(
  x = sprintf("%000d", sample(1:300)),
  group = rep(c("grp-1", "grp-2", "grp-3"), each = 100),
  data = purrr::map2(c(10, 20, 30), c(1, 2, 3), ~rnorm(100, .x, .y)) %>% 
    unlist()
)

fake_data %>%
  group_by(group) %>%
  dplyr::summarize(
    mean = mean(data),
    sd = sd(data),
    list_data = list(data), .groups = "drop") %>%
  gt() %>%
  gtExtras::gt_sparkline(list_data, type = "density") %>% 
  gtsave("test-table.png")

A screenshot of the table created with the above code. Note that the table has a better implementation of density plots, without the long tails.

Created on 2021-10-02 by the reprex package (v2.0.1)

erdirstats commented 3 years ago

Unfortunately, this doesn't seem to solve the issue when the tail of the density plot is just long (which happens a lot in my case).

jthomasmock commented 3 years ago

Gotcha. So the approach that the ggplot2 team takes is available via stat_density(trim = TRUE)

Reference source code available at: https://github.com/tidyverse/ggplot2/blob/master/R/stat-density.r#L88-L93

I have chosen a slightly expanded range() via scales::expand_range(). Note that this will essentially limit the density to the observed x-range + 5% on each end rather than the simple range() in ggplot2.
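A rough sketch of the two trimming strategies described above, written in base R so it runs without extra packages. The helper trimmed_density() and its expand argument are hypothetical names for illustration, not the gtExtras implementation. The padding arithmetic is the same as scales::expand_range(rng, mul = 0.05) for a range of non-zero width, and the 5% figure is taken from the comment above.

```r
# Hypothetical helper, not the gtExtras source: evaluate the kernel density
# only over the observed range, optionally padded by a fraction on each end.
trimmed_density <- function(x, expand = 0.05) {
  rng <- range(x, na.rm = TRUE)
  # Same arithmetic as scales::expand_range(rng, mul = expand)
  # when the range has non-zero width.
  pad <- diff(rng) * expand
  stats::density(x, from = rng[1] - pad, to = rng[2] + pad, na.rm = TRUE)
}

set.seed(37)
x <- rnorm(100, mean = 10, sd = 1)

d_default  <- stats::density(x)                  # extends 3 bandwidths past the data
d_trimmed  <- trimmed_density(x, expand = 0)     # plain range(), like trim = TRUE in ggplot2
d_expanded <- trimmed_density(x, expand = 0.05)  # range() plus 5% on each end

# Compare the x extent of each curve with the observed data
rbind(
  data     = range(x),
  default  = range(d_default$x),
  trimmed  = range(d_trimmed$x),
  expanded = range(d_expanded$x)
)
```

Trimming only changes where the curve is evaluated, not the bandwidth, so the curve is cut off at the limits rather than falling to zero there.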

I've pushed gtExtras v0.2.15. Feel free to take it for a spin and let me know if the trim is ideal.

library(gt)
library(dplyr, warn.conflicts = FALSE)
library(purrr)

set.seed(37)
fake_data <- tibble(
  x = sprintf("%000d", sample(1:300)),
  group = rep(c("grp-1", "grp-2", "grp-3"), each = 100),
  data = purrr::map2(c(10, 20, 30), c(1, 2, 3), ~rnorm(100, .x, .y)) %>% 
    unlist()
)

fake_data %>%
  dplyr::group_by(group) %>%
  dplyr::summarize(
    mean = mean(data),
    sd = sd(data),
    list_data = list(data), .groups = "drop") %>%
  gt() %>%
  gtExtras::gt_sparkline(list_data, type = "density", trim = TRUE) %>% 
  gtsave("test-table.png")

Created on 2021-10-04 by the reprex package (v2.0.1)

erdirstats commented 3 years ago

Thanks a lot! It looks nice! Maybe the trim is a bit too much when there are no long tails, but overall it would work, I guess.

jthomasmock commented 3 years ago

Cheers! Yah, I'm not a big fan of trimming but took the same approach as ggplot so it's at least consistent.