MakieOrg / Makie.jl

Interactive data visualizations and plotting in Julia
https://docs.makie.org/stable
MIT License

Feature Request: Easy Fan Plots #1995

Open ParadaCarleton opened 2 years ago

ParadaCarleton commented 2 years ago

A fan plot is a visually appealing way to display many confidence (or credible) intervals at once. This makes it easier to see the uncertainty as a continuous whole, which helps counter bad habits like discounting any values that fall outside a 95% confidence interval. Here's a good example of a fan plot: image

It'd be great if these were easier to make!

asinghvi17 commented 2 years ago

Could you post some code showing how you deal with it right now? From my perspective, this seems appropriate:

# example code from https://makie.juliaplots.org/stable/examples/plotting_functions/band/
using Statistics
using CairoMakie

f = Figure()
Axis(f[1, 1])

n, m = 100, 101
t = range(0, 1, length=m)
X = cumsum(randn(n, m), dims = 2)
X = X .- X[:, 1]
μ = vec(mean(X, dims=1)) # mean
lines!(t, μ)              # plot mean line
σ = vec(std(X, dims=1))  # stddev

# end example code
# plot decreasing "confidence intervals" - replace this with real confint calculations
for i in 1:4 # sigma/i
    # band! takes (x, lower, upper); each band is drawn on top of the wider ones
    band!(t, μ - σ ./ i, μ + σ ./ i, color = (:red, 0.3))
end
f

Basically, this "stacks" transparency so that each succeeding layer appears more and more opaque. Thus we go from outside (large width) to inside (small width).
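
As a rough sketch of what "real" interval calculations could look like in place of the σ/i placeholder, the bands can be taken from pointwise empirical quantiles of the simulated paths. The helper name central_band and the coverage levels below are assumptions made for illustration; they are not part of Makie or of the example above.

```julia
using Statistics
using CairoMakie

# Same simulated random walks as in the example above.
n, m = 100, 101
t = range(0, 1, length = m)
X = cumsum(randn(n, m), dims = 2)
X = X .- X[:, 1]

# Hypothetical helper: pointwise central band with coverage p,
# i.e. the (1 - p)/2 and (1 + p)/2 empirical quantiles of every column.
function central_band(X, p)
    lo = [quantile(col, (1 - p) / 2) for col in eachcol(X)]
    hi = [quantile(col, (1 + p) / 2) for col in eachcol(X)]
    return lo, hi
end

f = Figure()
ax = Axis(f[1, 1])

# Widest band first, so the narrower bands stack on top of it.
for p in (0.95, 0.8, 0.5)
    lo, hi = central_band(X, p)
    band!(ax, t, lo, hi, color = (:red, 0.3))
end
lines!(ax, t, vec(median(X, dims = 1)), color = :black)  # pointwise median
f
```

These are quantiles of the sample paths themselves, not confidence intervals for the mean; swap in whatever interval calculation fits your model.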

ParadaCarleton commented 2 years ago

Oh wow, that's great! Maybe this should be in a tutorial?

ParadaCarleton commented 2 years ago

Actually, thinking about it, this kind of stacking might be a problem for users who also want to create a legend; is there a way to add one to make it clear what each color represents?

asinghvi17 commented 2 years ago

Hmm, that's a good point. In the legend each plot would be displayed at an equal transparency.

One could certainly create a custom legend element instead of the automatic one, with the same stacking behaviour (by stacking PolyElement or similar).
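
A minimal sketch of that idea, assuming Legend accepts a vector of elements per entry and overlays them (as it does for mixed line/marker entries). The labels and the layout position f[1, 2] are arbitrary choices for illustration.

```julia
using Statistics
using CairoMakie

n, m = 100, 101
t = range(0, 1, length = m)
X = cumsum(randn(n, m), dims = 2)
X = X .- X[:, 1]
μ = vec(mean(X, dims = 1))
σ = vec(std(X, dims = 1))

f = Figure()
ax = Axis(f[1, 1])
fill_color = (:red, 0.3)
for i in 1:4
    band!(ax, t, μ .- σ ./ i, μ .+ σ ./ i, color = fill_color)
end
lines!(ax, t, μ, color = :black)

# The region inside μ ± σ/i is covered by i bands, so its legend entry
# overlays i copies of the same translucent patch.
entries = [[PolyElement(color = fill_color) for _ in 1:i] for i in 1:4]
labels = ["μ ± σ/$i" for i in 1:4]
Legend(f[1, 2], entries, labels)
f
```

This keeps the legend patches in step with the plot if the alpha value changes, at the cost of building the entries by hand.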

jkrumbiegel commented 2 years ago

You can also precompute the colors that alpha stacking produces by blending red with white in different ratios. Those would then look correct in the legend automatically.
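
Roughly, each layer with alpha a leaves a fraction (1 - a) of whatever is beneath it visible, so after k layers the background keeps a weight of (1 - a)^k. The sketch below assumes a white background and uses a made-up helper name, stacked_color; the labels are also just for illustration.

```julia
using Statistics
using CairoMakie

# Hypothetical helper: opaque color equivalent to k stacked layers of `fg`
# with alpha `a` over the background `bg` (both given as RGB tuples).
function stacked_color(fg, bg, a, k)
    w = (1 - a)^k                      # weight the background retains
    return RGBf(((1 - w) .* fg .+ w .* bg)...)
end

red, white = (1.0, 0.0, 0.0), (1.0, 1.0, 1.0)
colors = [stacked_color(red, white, 0.3, k) for k in 1:4]

n, m = 100, 101
t = range(0, 1, length = m)
X = cumsum(randn(n, m), dims = 2)
X = X .- X[:, 1]
μ = vec(mean(X, dims = 1))
σ = vec(std(X, dims = 1))

f = Figure()
ax = Axis(f[1, 1])
# Opaque bands, widest first; band i gets the color of i stacked layers.
for i in 1:4
    band!(ax, t, μ .- σ ./ i, μ .+ σ ./ i, color = colors[i], label = "μ ± σ/$i")
end
lines!(ax, t, μ, color = :black, label = "mean")
axislegend(ax, position = :lt)
f
```

One trade-off: opaque bands hide the grid lines behind them, which the translucent version does not.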

ParadaCarleton commented 2 years ago

Another comment: continuous fan plots can be a big improvement on ones that show only a handful of intervals. Examples can be found here.

asinghvi17 commented 2 years ago

This kind of thing should be easily doable with surface. Consider the following code (yes, a little long :D). Viridis colormap: image. :rainbow_bgyr_35_85_c72_n256 colormap: image.

# Fan plot replication

using KernelDensity, Trapz
using CairoMakie
# construct data
xs = LinRange(0, 1, 100)

# 1000 samples per x; spread and location both grow with x
y_data = [randn(1000) .* (1+x) .+ x^2 .+ x * 0.3 for x in xs]

# helper functions

# cumulative trapezoidal integral of y over x
function cumtrapz(x, y)
    return [trapz(@view(x[begin:i]), @view(y[begin:i])) for i in axes(x, 1)]
end

# KDE of `data` evaluated at N points, with the tails trimmed where the
# estimated CDF is below min_cutoff or above 1 - min_cutoff
function grid_kde(data, N = 100; min_cutoff=0.05)
    k = KernelDensity.kde(data)
    ct = cumtrapz(k.x, k.density)
    minind = findfirst(>(min_cutoff), ct)
    maxind = findlast(<(1-min_cutoff), ct)
    xs = LinRange(k.x[minind], k.x[maxind], N)
    return (xs, pdf.(Ref(InterpKDE(k)), xs))
end

# process data to extract "distributions"
N = 200
x_vals = zeros(N, length(xs))
y_vals = zeros(N, length(xs))
y_colors = zeros(N, length(xs))

for i in eachindex(xs)
    x_vals[:, i] .= xs[i]
    y_vals[:, i], y_colors[:, i] = grid_kde(y_data[i], N; min_cutoff = 0.01)
    # uncomment the following if you do not want absolute coloring
    # y_colors[:, i] ./= maximum(@view(y_colors[:, i]))
end

fig = Figure()
ax = Axis(fig[1, 1])
# Makie 0.20 and later spell this option shading = NoShading
plt = surface!(ax, x_vals, y_vals; color = y_colors, shading = false)
fig

asinghvi17 commented 1 year ago

Here's an interesting plot I made with this (to try and see a continuous confidence interval): image

ffreyer commented 2 months ago

Is this something we want to support as a plot type or something that users should do on their own (i.e. document in BeautifulMakie or here as a more complex usage example)?

asinghvi17 commented 2 months ago

Probably something users can do on their own, though I'd like to leave the issue open for documentation in that case. The recipe is super specific.

aplavin commented 2 months ago

A basic version of a discrete fan plot is already easy to achieve with generic functions and recipes: image. It's hard to see any potential for improvement here...

Btw, even more convenient with multiplot (experimental) from MakieExtra: image