mjskay / ggdist

Visualizations of distributions and uncertainty
https://mjskay.github.io/ggdist/
GNU General Public License v3.0

Mode calculated incorrectly #236

Open fkohrt opened 2 months ago

fkohrt commented 2 months ago

The following code is supposed to display a spike at the density's mode, but that's apparently not the case.

data.frame(x = datasets::AirPassengers) |>
  ggplot2::ggplot(ggplot2::aes(x = x)) +
  ggdist::stat_slab(ggplot2::aes(height = max(ggplot2::after_stat(pdf)))) +
  ggdist::stat_spike(
    ggplot2::aes(height = max(ggplot2::after_stat(pdf))),
    at = ggdist::Mode
  )


bwiernik commented 1 month ago

The function detects the x data as discrete because the values are all integers. As a result, it computes the discrete mode (the single most frequent value) rather than the continuous maximum a posteriori (MAP) value (the highest point of the estimated data distribution).
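To make the distinction concrete, here is a minimal base-R sketch of the two estimators on the same data. It is only an illustration: it uses `table()` and `stats::density()` with default settings, which is an assumption and not necessarily how ggdist computes either value, and the variable names are made up for this example.

```r
# Illustrative sketch only; not ggdist's implementation.
x <- as.numeric(datasets::AirPassengers)

# Discrete mode: the single most frequent value in the data.
counts <- table(x)
discrete_mode <- as.numeric(names(counts)[which.max(counts)])

# Continuous mode: the location of the peak of a kernel density estimate
# (default bandwidth and kernel, assumed here for simplicity).
dens <- stats::density(x)
continuous_mode <- dens$x[which.max(dens$y)]

c(discrete = discrete_mode, continuous = continuous_mode)
```

With integer-valued data the two numbers need not agree, since the most frequent single value can sit away from the peak of the smoothed density.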

I've opened a PR that adds an argument to override this default detection and force either the discrete or the continuous estimator: https://github.com/mjskay/ggdist/pull/240

library(tidyverse)
library(ggdist)

dat <- data.frame(x = as.numeric(datasets::AirPassengers))

data.frame(x = datasets::AirPassengers) |>
  ggplot2::ggplot(ggplot2::aes(x = x)) +
  ggdist::stat_slab(ggplot2::aes(height = max(ggplot2::after_stat(pdf)))) +
  ggdist::stat_spike(
    ggplot2::aes(height = max(ggplot2::after_stat(pdf))),
    at = Mode
  )
#> Don't know how to automatically pick scale for object of type <ts>. Defaulting
#> to continuous.


data.frame(x = datasets::AirPassengers) |>
  ggplot2::ggplot(ggplot2::aes(x = x)) +
  ggdist::stat_slab(ggplot2::aes(height = max(ggplot2::after_stat(pdf)))) +
  ggdist::stat_spike(
    ggplot2::aes(height = max(ggplot2::after_stat(pdf))),
    at = \(x) Mode(x, type = "continuous")
  )
#> Don't know how to automatically pick scale for object of type <ts>. Defaulting
#> to continuous.

Created on 2024-07-18 with reprex v2.0.2