Bad plot for LogNormal with small σ

JuliaPlots / StatsPlots.jl

Statistical plotting recipes for Plots.jl

Bad plot for LogNormal with small σ #467

Open knuesel opened 2 years ago

knuesel commented 2 years ago

The "automatic" plot for a LogNormal distribution with small σ looks pretty bad:

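using StatsPlots, Distributions  # Distributions provides LogNormal and pdf

plot(
    plot(LogNormal(0, 1e-3), title="Default"),
    plot(LogNormal(0, 1e-3), xlim=(0.8, 1.2), title="Manual x limits"),
    # Evaluate the pdf on an explicit grid instead of relying on the automatic range
    plot(0.8:0.001:1.2, x->pdf(LogNormal(0, 1e-2), x), title="Expected"),
    size=(600, 600), layout=(3,1)
)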

sethaxen commented 2 years ago

StatsPlots uses a heuristic to determine the bounds over which a distribution should be plotted: https://github.com/JuliaPlots/StatsPlots.jl/blob/b99a4ff6c760f0e1275bb54572c94da03a7bc81f/src/distributions.jl#L3-L7

For bounded variables, the bound is included. This heuristic works most of the time, but it breaks down in cases like this one, where the distribution is concentrated far away from the bound relative to its width.

It might make sense to modify the heuristic to compute two candidate ranges, one from the quantiles alone and one that includes the finite bounds, and to choose the quantile-based range if its width is below some threshold fraction of the other's.
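A minimal sketch of that thresholded choice, for illustration only. The function names, the tail probability `alpha`, and the `threshold` value are assumptions made here; none of them are part of the StatsPlots API, and the first two helpers only approximate the behaviour described above.

```julia
using Distributions

# Hypothetical helpers for illustration; not part of StatsPlots.

# Range taken from the tail quantiles only.
quantile_range(d, alpha = 1e-4) = (quantile(d, alpha), quantile(d, 1 - alpha))

# Range that snaps to a finite support bound wherever one exists.
function bounded_range(d, alpha = 1e-4)
    qlo, qhi = quantile_range(d, alpha)
    lo = isfinite(minimum(d)) ? minimum(d) : qlo
    hi = isfinite(maximum(d)) ? maximum(d) : qhi
    return (lo, hi)
end

# Use the quantile range when it is narrower than `threshold` times
# the bounded range; otherwise keep the bounded range.
function thresholded_range(d; alpha = 1e-4, threshold = 0.5)
    q = quantile_range(d, alpha)
    b = bounded_range(d, alpha)
    return (q[2] - q[1]) < threshold * (b[2] - b[1]) ? q : b
end
```

With these assumed defaults, `thresholded_range(LogNormal(0, 1e-3))` gives a narrow interval around 1, while `thresholded_range(LogNormal(0, 1))` still starts at the bound 0.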

sethaxen commented 2 years ago

The downside of the suggested change is that it's discontinuous. This could cause problems when recording animations or plotting interactively with Pluto, where the limits could suddenly change drastically. It might be better to smoothly interpolate between the two ranges once the one using the quantiles is significantly smaller than the one using the bounds.
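One way such a smooth transition could look, as a sketch under stated assumptions: the name `blended_range`, the cut-offs `r_lo` and `r_hi`, and the use of a smoothstep weight are all choices made here for illustration and do not come from StatsPlots.

```julia
using Distributions

# Hypothetical helper for illustration; not part of StatsPlots.
function blended_range(d; alpha = 1e-4, r_lo = 0.1, r_hi = 0.5)
    # Candidate 1: tail quantiles only.
    qlo, qhi = quantile(d, alpha), quantile(d, 1 - alpha)
    # Candidate 2: snap to finite support bounds where they exist.
    blo = isfinite(minimum(d)) ? minimum(d) : qlo
    bhi = isfinite(maximum(d)) ? maximum(d) : qhi
    # Width ratio of the two candidates, in (0, 1].
    r = (qhi - qlo) / (bhi - blo)
    # Weight of the bounded range: 0 below r_lo, 1 above r_hi,
    # with a smoothstep transition in between.
    t = clamp((r - r_lo) / (r_hi - r_lo), 0, 1)
    w = t^2 * (3 - 2t)
    return (w * blo + (1 - w) * qlo, w * bhi + (1 - w) * qhi)
end
```

Because the weight varies continuously with the width ratio, the limits move gradually as σ shrinks rather than jumping at a single threshold.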

mkborregaard commented 2 years ago

The issue here doesn't seem to be with the bounds though, but rather with how adapted_grid infers the internal points at which to evaluate the function? I'm puzzled by the second plot above. Manually passing limits should solve these issues, including for animations etc.

knuesel commented 2 years ago

I guess it's both: the bound is also an issue, because the current heuristic always takes 0 for the left bound, and that will never give a good-looking plot for e.g. LogNormal(6, 0.001) (which has mean ~400 and standard deviation ~0.4).
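The figures quoted above can be checked directly with Distributions.jl. The snippet below is only a quick check; the tail probability `1e-4` is an assumed value used to show how narrow the bulk of the distribution is compared with a range that starts at 0.

```julia
using Distributions

d = LogNormal(6, 0.001)

m = mean(d)   # exp(μ + σ²/2)
s = std(d)    # m * sqrt(exp(σ²) - 1)

# Interval holding all but the extreme tails (assumed alpha = 1e-4),
# compared with a range whose left end is pinned at 0.
qlo, qhi = quantile(d, 1e-4), quantile(d, 1 - 1e-4)
width_ratio = (qhi - qlo) / (qhi - 0)

println((mean = m, std = s, qlo = qlo, qhi = qhi, width_ratio = width_ratio))
```

The quantile interval covers under 1% of the range from 0 to the upper quantile, which is why a plot starting at 0 shows little more than a spike.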

mkborregaard commented 2 years ago

Yes, what I mean is that the reason the second plot looks weird isn't the issue with the bounds (which I agree is an issue), and I'm not sure what causes it then.

sethaxen commented 2 years ago

The second plot looks weird because it's only changing the limits, which I think is done after the plot is drawn. So it's just taking the first plot and changing the axes. I believe the first plot is bad for the reasons I mentioned.