NaN density for transformed distributions when value is outside of support region

venpopov commented 3 months ago

It is standard to return 0 for the density when the value is outside of the support region. For example:

density(dist_lognormal(), -1) # gives 0
density(dist_lognormal(), 0) # gives 0

However, the following produces NaN:

density(exp(dist_wrap('norm')), -1) # NaN
density(exp(dist_wrap('norm')), 0) # NaN

This is because the numeric calculation of the derivative produces a NaN value.

There could be two ways to solve this:

1) Check if result is NaN before passing it to the user and replace with 0.

Quite easy, but I'm not sure if there might be cases where you wouldn't want to do it - cases for which NaN is produced for some other reason and it doesn't make sence to return 0. I can't think of any, but that doesn't mean there aren't

2) Check the limits of the support region and return 0 if value is outside

This can also work nicely and it will be more robust, since the support region is already calculated for the transformed distribution. However, currently the support function does not distinguish between open and closed ranges (assumed closed range)

support(exp(dist_wrap('norm')))
#> <support_region[1]>
#> [1] [0,Inf]

For this to fully work, the support function should differentiate between open and closed intervals

This seems super minor, but it is currently causing me problems when trying to plot transformed distributions via ggdist

mitchelloharawild commented 3 months ago

Definitely would prefer (2), while (1) is easier I think it could cause some unexpected issues with other distributions down the line. Would you be interested in making a PR for this?

venpopov commented 3 months ago

Sure ;)

mjskay commented 3 months ago

Something that could help with this might also be to allow the user to supply the derivatives of the transformation so that it does not need to be done with numerical differentiation. In some situations this will also already be available; e.g link functions defined by family() and ggplot scale functions (thanks to a PR I got in awhile ago). Scale transformations of distributions are much more reliable in ggdist now because it takes advantage of this, but it requires me to do the transformation myself instead of relying on distributional.

venpopov commented 3 months ago

I thought about the same as well. Either allow the user to supply the derivative function, or have a lookup table (or symbolic differentiation) similar for the inverse (or both, if the latter fails).

But by itself it' still not sufficient. For example, exp(dist_wrap('norm')) has a jacobian of 1/x. But the other problem is that then the inverse is log(x), and this gets calculated as well in the main density function, resulting in NaNs for negative values. So an explicit correction for the bounds is actually necessary. The symbolic derivatives will help still for some problematic behavior near the bounds.

venpopov commented 3 months ago

In the PR I just posted it works quite well now. For example, this is the output based on master:

library(ggplot2)
library(distributional)
library(ggdist)
dist <- exp(dist_student_t(10))
ggplot() +
  stat_slab(aes(xdist = dist), 
            limits = c(0,10),
            fill = NA,
            color = 'red')
#> Warning messages:
#> 1: In .Primitive("log")(x) : NaNs produced
#> 2: In .Primitive("log")(x) : NaNs produced
#> 3: In .Primitive("log")(x) : NaNs produced
#> 4: In .Primitive("log")(x) : NaNs produced

And this is from the PR:

And integrate fails based on master:

dens <- function(x) density(dist, x)[[1]]
integrate(dens, -Inf, Inf)
#> Error in integrate(dens, -Inf, Inf) : non-finite function value
#> In addition: There were 50 or more warnings (use warnings() to see the first 50)

whereas now it works almost perfectly (though I suspect not perfectly 1 because of the still weird behavior very near 0 - numDeriv gives NaNs for input values like 0.00001):

dens <- function(x) density(dist, x)[[1]]
integrate(dens, -Inf, Inf)
#> 0.9999998 with absolute error < 7.9e-05

venpopov commented 3 months ago

Alright, I actually got the symbolic differentiation working completely for arbitrary levels of nested transformations. And it's a bit faster...

newdist <- exp(dist_wrap('norm'))
x <- seq(0,10,0.01)
microbenchmark::microbenchmark(density(newdist, x, deriv_method = "symbolic"),
                               density(newdist, x, deriv_method = "numeric"))

#> Unit: microseconds
#>                                            expr       min        lq       mean  median        uq       max neval
#>  density(newdist, x, deriv_method = "symbolic")   175.234   185.238   222.9215   221.4   246.041   476.707   100
#>   density(newdist, x, deriv_method = "numeric") 28770.028 30157.468 31754.9719 30859.2 33455.303 43195.878   100

methods agree for simple cases

> density(newdist, 1)
Using symbolic differentiation
[1] 0.3989423
> density(newdist, 1, deriv_method = "numeric")
Using numerical differentiation.
[1] 0.3989423

numeric fails for boundary conditions but symbolic works

> density(newdist, 0.00001)
Using symbolic differentiation
[1] 6.585616e-25
> density(newdist, 0.00001, deriv_method = "numeric")
Using numerical differentiation.
[1] NaN

arbitrary nesting of functions and numeric operators:

> d2 <- expm1(exp(dist_gamma(1,1)^0.2))
> density(d2, 5, deriv_method = "symbolic")
Using symbolic differentiation
[1] 0.05029254
> density(d2, 5, deriv_method = "numeric")
Using numerical differentiation.
[1] 0.05029254
> ggplot() + stat_slab(aes(xdist = d2))
Using symbolic differentiation

I will make a separate PR for this tomorrow after I test it a bit more

mjskay commented 3 months ago

Sweet! As part of this, would it be possible to allow the user to pass the derivatives if known? I believe that would be the last piece necessary for me to offload all logic for density corrections for scale transformations from ggdist to distributional...

mitchelloharawild commented 3 months ago

Looks great, thanks! I agree with @mjskay that optionally storing known derivatives to transformed distributions (dist_transformed()) would be useful. Symbolic differentiation is something I was working toward so it's nice to see that it is working well here, this could potentially be the default.

venpopov commented 3 months ago

Yes, it already works to supply a user derivative function, to me that's also really important.

BTW, great work on both of your packages! I knew of them before, but now decided to use them in a package we are building and the integration is seamless. Thanks for being open to these changes!

mitchelloharawild commented 3 months ago

Great! Should I review/accept PR #98 or wait for the symbolic derivatives PR? I imagine the symbolic work builds upon #98?

venpopov commented 3 months ago

I suggest review 98 first. The symbolic derives do build on that in the sense that I branched off that branch, but otherwise they would work as is even if based on master. So functionality is pretty much independent

mitchelloharawild / distributional

NaN density for transformed distributions when value is outside of support region #97