Open spinkney opened 2 years ago
Having the erfcx function in Stan will also help speed up the Wiener diffusion distribution and aid in the calculation of the cdf. See "Even faster and even more accurate first-passage time densities and distributions for the Wiener diffusion model" by Matthias Gondan, Steven P. Blurton, and Miriam Kesselmeier.
Also, if we implement a log_Phi function, then erfcx can be used to extend it to large negative values. Below is the calculation of log(Phi(x)) for a large negative x in stan-math, which underflows to -Inf, followed by the same calculation using erfcx:
x <- -40
log(0.5) + log(my_erfc(-(1/sqrt(2)) * x))
[1] -Inf
log(0.5) + log(my_erfcx(-(1/sqrt(2)) * x)) - x^2 * 0.5
[1] -804.6084
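The reason the second form stays finite can be written out in a few lines. This is a sketch of the identity behind the R snippet, assuming the standard definitions $\Phi(x) = \tfrac12\,\mathrm{erfc}(-x/\sqrt2)$ and $\mathrm{erfcx}(z) = e^{z^2}\,\mathrm{erfc}(z)$; the symbol $z$ is introduced here only for illustration.

```latex
\begin{align*}
\Phi(x) &= \tfrac{1}{2}\,\operatorname{erfc}\!\left(-\tfrac{x}{\sqrt{2}}\right),
\qquad
\operatorname{erfc}(z) = e^{-z^{2}}\,\operatorname{erfcx}(z), \\
\log \Phi(x)
 &= \log\tfrac{1}{2}
  + \log \operatorname{erfcx}\!\left(-\tfrac{x}{\sqrt{2}}\right)
  - \tfrac{x^{2}}{2}
 \qquad \text{with } z = -\tfrac{x}{\sqrt{2}},\ z^{2} = \tfrac{x^{2}}{2}. \\
\intertext{For large $z > 0$, $\operatorname{erfcx}(z) \approx 1/(z\sqrt{\pi})$, so at $x = -40$:}
\log \Phi(-40)
 &\approx \log\tfrac{1}{2} - \log\!\left(\tfrac{40}{\sqrt{2}}\sqrt{\pi}\right) - 800
 \approx -804.61 .
\end{align*}
```

The exponential factor that drives erfc to zero is moved outside the logarithm as the term $-x^2/2$, while erfcx itself decays only like $1/z$, so nothing underflows.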
This can also be used to make a log_erf function:
// log(erf(x)), defined for x > 0 where erf(x) is positive
real log_erf (real x) {
return log_diff_exp(0, log(erfcx(x)) - x^2);
}
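As a check on the function above, the identity it relies on can be sketched as follows. It assumes $x > 0$ and uses the fact that Stan's log_diff_exp(a, b) computes $\log(e^{a} - e^{b})$.

```latex
\begin{align*}
\operatorname{erf}(x)
 &= 1 - \operatorname{erfc}(x)
  = 1 - e^{-x^{2}}\,\operatorname{erfcx}(x), \\
\log \operatorname{erf}(x)
 &= \log\!\left(e^{0} - e^{\,\log \operatorname{erfcx}(x) - x^{2}}\right)
  = \operatorname{log\_diff\_exp}\!\left(0,\ \log \operatorname{erfcx}(x) - x^{2}\right).
\end{align*}
```

For $x \le 0$ the argument of the logarithm is not positive, so the expression only makes sense on the positive half-line.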
Here's log_Phi expanded. It is equal to std_normal_lcdf in the case of positive x (allows values of x up to 38) but extends the range for negative x. Also see https://github.com/stan-dev/math/issues/2470.
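log_Phi <- function(x) {
  # log(Phi(x)) via erfcx; returned as-is when x < 0
  y <- log(0.5) + log(my_erfcx(-(1/sqrt(2)) * x)) - x^2 * 0.5
  if (x < 0) {
    return(y)
  } else {
    # x >= 0: y2 = log(Phi(-x)), so log(Phi(x)) = log(1 - exp(y2))
    y2 <- log(0.5) + log(my_erfcx((1/sqrt(2)) * x)) - x^2 * 0.5
    return(log1mexp(-y2))
  }
}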
log_Phi <- Vectorize(log_Phi)
Phi_stan <- Vectorize(Phi_stan)
v <- 0:20
data.table(k = v, log_phi = log_Phi(v), stan_log_phi = log(Phi_stan(v)))
k log_phi stan_log_phi
1: 0 -6.931472e-01 -6.931472e-01
2: 1 -1.727538e-01 -1.727538e-01
3: 2 -2.301291e-02 -2.301291e-02
4: 3 -1.350810e-03 -1.350810e-03
5: 4 -3.167174e-05 -3.167174e-05
6: 5 -2.866516e-07 -2.866516e-07
7: 6 -9.865876e-10 -9.865877e-10
8: 7 -1.279813e-12 -1.279865e-12
9: 8 -6.220961e-16 -6.661338e-16
10: 9 -1.128588e-19 0.000000e+00
11: 10 -7.619853e-24 0.000000e+00
12: 11 -1.910660e-28 0.000000e+00
13: 12 -1.776482e-33 0.000000e+00
14: 13 -6.117164e-39 0.000000e+00
15: 14 -7.793537e-45 0.000000e+00
16: 15 -3.670966e-51 0.000000e+00
17: 16 -6.388754e-58 0.000000e+00
18: 17 -4.105996e-65 0.000000e+00
19: 18 -9.740949e-73 0.000000e+00
20: 19 -8.527224e-81 0.000000e+00
21: 20 -2.753624e-89 0.000000e+00
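The small nonzero values in the log_phi column, where the log(Phi_stan) column rounds to 0, come from the positive-x branch of log_Phi. A sketch of that branch follows, assuming log1mexp(a) computes $\log(1 - e^{-a})$ for $a > 0$ (the convention used by, for example, the copula package; the post does not say which implementation was used).

```latex
\begin{align*}
y_{2} &= \log \Phi(-x)
       = \log\tfrac{1}{2}
       + \log \operatorname{erfcx}\!\left(\tfrac{x}{\sqrt{2}}\right)
       - \tfrac{x^{2}}{2},
       \qquad x \ge 0, \\
\log \Phi(x) &= \log\bigl(1 - \Phi(-x)\bigr)
              = \log\bigl(1 - e^{\,y_{2}}\bigr)
              = \operatorname{log1mexp}(-y_{2}), \\
\log \Phi(x) &\approx -\Phi(-x)
 \qquad \text{when } \Phi(-x) \ll 1 .
\end{align*}
```

Under this reading, the x = 9 row is approximately $-\Phi(-9)$, a quantity that is lost when $\Phi(9)$ is first rounded to 1 in double precision and then passed to log.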
> v <- -100:-30
> data.table(k = v, log_phi = log_Phi(v), stan_log_phi = log(Phi_stan(v)))
k log_phi stan_log_phi
1: -100 -5005.5242 -Inf
2: -99 -4906.0142 -Inf
3: -98 -4807.5040 -Inf
4: -97 -4709.9938 -Inf
5: -96 -4613.4834 -Inf
6: -95 -4517.9729 -Inf
7: -94 -4423.4623 -Inf
8: -93 -4329.9517 -Inf
9: -92 -4237.4408 -Inf
10: -91 -4145.9299 -Inf
11: -90 -4055.4189 -Inf
12: -89 -3965.9077 -Inf
13: -88 -3877.3964 -Inf
14: -87 -3789.8850 -Inf
15: -86 -3703.3734 -Inf
16: -85 -3617.8617 -Inf
17: -84 -3533.3499 -Inf
18: -83 -3449.8379 -Inf
19: -82 -3367.3258 -Inf
20: -81 -3285.8135 -Inf
21: -80 -3205.3011 -Inf
22: -79 -3125.7885 -Inf
23: -78 -3047.2758 -Inf
24: -77 -2969.7629 -Inf
25: -76 -2893.2498 -Inf
26: -75 -2817.7366 -Inf
27: -74 -2743.2232 -Inf
28: -73 -2669.7096 -Inf
29: -72 -2597.1958 -Inf
30: -71 -2525.6818 -Inf
31: -70 -2455.1676 -Inf
32: -69 -2385.6533 -Inf
33: -68 -2317.1387 -Inf
34: -67 -2249.6239 -Inf
35: -66 -2183.1088 -Inf
36: -65 -2117.5936 -Inf
37: -64 -2053.0781 -Inf
38: -63 -1989.5623 -Inf
39: -62 -1927.0463 -Inf
40: -61 -1865.5301 -Inf
41: -60 -1805.0136 -Inf
42: -59 -1745.4968 -Inf
43: -58 -1686.9797 -Inf
44: -57 -1629.4623 -Inf
45: -56 -1572.9446 -Inf
46: -55 -1517.4266 -Inf
47: -54 -1462.9083 -Inf
48: -53 -1409.3896 -Inf
49: -52 -1356.8706 -Inf
50: -51 -1305.3511 -Inf
51: -50 -1254.8314 -Inf
52: -49 -1205.3112 -Inf
53: -48 -1156.7906 -Inf
54: -47 -1109.2695 -Inf
55: -46 -1062.7481 -Inf
56: -45 -1017.2261 -Inf
57: -44 -972.7036 -Inf
58: -43 -929.1807 -Inf
59: -42 -886.6572 -Inf
60: -41 -845.1331 -Inf
61: -40 -804.6084 -Inf
62: -39 -765.0832 -Inf
63: -38 -726.5572 -Inf
64: -37 -689.0306 -689.0306
65: -36 -652.5032 -652.5032
66: -35 -616.9751 -616.9751
67: -34 -582.4462 -582.4462
68: -33 -548.9164 -548.9164
69: -32 -516.3856 -516.3856
70: -31 -484.8540 -484.8540
71: -30 -454.3212 -454.3212
Description
According to Wikipedia, the exponentially modified Gaussian can be computed more precisely by a reparameterization that uses the scaled erfc (erfcx).
I have implemented the reparameterization and show the issue below.
Example