JuliaStats / StatsFuns.jl

Mathematical functions related to statistics.

Feature request: Owen's t function #99

Open maximerischard opened 4 years ago

maximerischard commented 4 years ago

Owen's T function currently does not seem to have a native Julia implementation. Most existing implementations in Fortran or C are under the GPL, which makes this a little tricky.

This would be useful for the Skew Normal distribution in Distributions.jl.

azev77 commented 4 years ago

@maximerischard @johnczito There is a PR here: https://github.com/JuliaMath/SpecialFunctions.jl/pull/242 Distributions.jl calls both StatsFuns.jl & SpecialFunctions.jl

johnczito commented 4 years ago

Oh, what do you know. Thanks for the heads up. If that gets merged and released, we can close this and you can open the PR in Distributions.

blackeneth commented 3 years ago

Looks like JuliaMath/SpecialFunctions.jl#242 flamed out due to licensing issues -- the code it was derived from was LGPL.

I can work on this next, basing it on Fast and Accurate Calculation of Owen's T Function, which won't have a license issue (The Journal of Statistical Software uses the Creative Commons License).

I'm working on getting the MVN cdf (qsmivnv.jl) over the finish line, and the Johnson Distributions for Distributions.jl right now. Next I can turn my attention to Owen's T.

azev77 commented 3 years ago

@blackeneth that would be awesome! Looking forward to it!

blackeneth commented 3 years ago

@azev77 and others ...

History: In the past 20 or so years, most implementations of Owen's T function have followed the algorithms given in "Fast and Accurate Calculation of Owen's T Function" by M. Patefield and D. Tandy, Journal of Statistical Software, 5 (5), 1-25 (2000).

Six algorithms were given, and which one was used depended on the values of (h,a) in owent(h,a).

They developed code for these algorithms on a DEC VAX 750. The VAX 750 came out in 1980 and had a processor clock speed of 3.125 MHz.

The reason for six algorithms was to speed up the function where possible, with T1 being faster than T2, T2 faster than T3, and so on.

TODAY

We have faster and better computers today.

If we skip T1 through T4, we can go straight to 48-point Gauss-Legendre quadrature for a ≤ 0.999999. For a > 0.999999, we can fall back to adaptive quadrature using QuadGK. This is sufficient for 15-digit accuracy in Float64. The Gauss-Legendre path takes about 3 microseconds, and QuadGK takes about 20 microseconds.
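For reference, both branches approximate the same defining integral, T(h,a) = (1/2π) ∫₀ᵃ exp(−h²(1+x²)/2)/(1+x²) dx. A minimal QuadGK sketch of that reference integral (not the proposed fast implementation, just the slow-but-simple baseline):

```julia
# Minimal sketch: evaluate Owen's T directly from its defining integral.
# This is the slow reference; the Gauss-Legendre path approximates the
# same integral with 48 fixed nodes and weights.
using QuadGK

function owent_reference(h, a)
    integrand(x) = exp(-0.5 * h^2 * (1 + x^2)) / (1 + x^2)
    val, err = quadgk(integrand, 0.0, a; atol = eps())
    return val / (2π)
end

# T(0, a) = atan(a)/(2π), so T(0, 1) = (π/4)/(2π) = 1/8
owent_reference(0.0, 1.0)  # → 0.125
```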

Questions: 1) Any objection to this? 2) Do we need BigFloat support?

Code

I call this function "owenteasy(h,a)":

# owenteasy
#    Written by Andy Gough; August 2021
#    Rev 1.02
using QuadGK
using Distributions
using LinearAlgebra: dot
using SpecialFunctions: erfc
using IrrationalConstants: inv2π, invsqrt2

function owenteasy(h::T1,a::T1; calc_method=nothing, m=nothing) where {T1 <: Real}

    #*********************
    # shortcut evaluations 
    #*********************

    if h < zero(h) 
        return owenteasy(abs(h),a)
    end 

    if h==zero(h) 
        return atan(a)*inv2π
    end

    if a < zero(a) 
        return -owenteasy(h,abs(a))
    end 

    if a==zero(a)
        return zero(a)
    end 

    if a==one(a) 
        return 0.125*erfc(-h*invsqrt2)*erfc(h*invsqrt2)
    end 

    if a==Inf 
        return 0.25*erfc(sqrt(h^2)*invsqrt2)
    end 

    # below reduces the range from -inf < h,a < +inf to h ≥ 0, 0 ≤ a ≤ 1
    unitnorm = Normal()
    if a > one(a) 
        return 0.5*cdf(unitnorm,h) + 0.5*cdf(unitnorm,a*h) - cdf(unitnorm,h)*cdf(unitnorm,a*h) - owenteasy(a*h,one(a)/a)
    end 

    # calculate Owen's T 

    if a ≤ 0.999999 

        t2(h,a,x) = inv2π*(a/2)*exp(-0.5*(h^2)*(1+(a*x)^2))/(1+(a*x)^2)

        x = [-0.9987710072524261, -0.9935301722663508, -0.9841245837228269, -0.9705915925462473, -0.9529877031604309, -0.9313866907065543, -0.9058791367155696
        , -0.8765720202742479, -0.8435882616243935, -0.8070662040294426, -0.7671590325157404, -0.7240341309238146, -0.6778723796326639, -0.6288673967765136
        , -0.5772247260839727, -0.5231609747222331, -0.4669029047509584, -0.4086864819907167, -0.34875588629216075, -0.28736248735545555, -0.22476379039468905
        , -0.1612223560688917, -0.0970046992094627, -0.03238017096286937, 0.03238017096286937, 0.0970046992094627, 0.1612223560688917, 0.22476379039468905
        , 0.28736248735545555, 0.34875588629216075, 0.4086864819907167, 0.4669029047509584, 0.5231609747222331, 0.5772247260839727, 0.6288673967765136
        , 0.6778723796326639, 0.7240341309238146, 0.7671590325157404, 0.8070662040294426, 0.8435882616243935, 0.8765720202742479, 0.9058791367155696
        , 0.9313866907065543, 0.9529877031604309, 0.9705915925462473, 0.9841245837228269, 0.9935301722663508, 0.9987710072524261]

        w = [0.0031533460523059122, 0.0073275539012762885, 0.011477234579234613, 0.015579315722943824, 0.01961616045735561, 0.023570760839324363
        , 0.027426509708356944, 0.031167227832798003, 0.03477722256477052, 0.038241351065830737, 0.04154508294346467, 0.0446745608566943, 0.04761665849249045
        , 0.05035903555385445, 0.05289018948519363, 0.055199503699984116, 0.05727729210040322, 0.05911483969839564, 0.06070443916589387, 0.06203942315989268
        , 0.06311419228625402, 0.06392423858464813, 0.06446616443594998, 0.06473769681268386, 0.06473769681268386, 0.06446616443594998, 0.06392423858464813
        , 0.06311419228625402, 0.06203942315989268, 0.06070443916589387, 0.05911483969839564, 0.05727729210040322, 0.055199503699984116
        , 0.05289018948519363, 0.05035903555385445, 0.04761665849249045, 0.0446745608566943, 0.04154508294346467, 0.038241351065830737, 0.03477722256477052
        , 0.031167227832798003, 0.027426509708356944, 0.023570760839324363, 0.01961616045735561, 0.015579315722943824, 0.011477234579234613, 0.0073275539012762885
        , 0.0031533460523059122]

        TOwen = dot(w, t2.(h,a,x))

        return TOwen
    else
        # a > 0.999999, use quadrature 

        tt(h,x) = inv2π*exp(-0.5*(h^2)*(1.0+x^2))/(1.0+x^2)

        TOwen, err = quadgk(x -> tt(h,x),0.0,a, atol=eps())

        return TOwen
    end 
end 
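The a > 1 branch above relies on a standard range-reduction identity for Owen's T. A self-contained numerical check of that identity, using QuadGK on the defining integral and erfc from SpecialFunctions for the normal CDF:

```julia
# Numerically check the range-reduction identity used for a > 1:
#   T(h, a) = Φ(h)/2 + Φ(ah)/2 − Φ(h)Φ(ah) − T(ah, 1/a)   (h, a > 0)
using QuadGK
using SpecialFunctions: erfc

Φ(x) = 0.5 * erfc(-x / sqrt(2))                     # standard normal CDF
T(h, a) = quadgk(x -> exp(-0.5 * h^2 * (1 + x^2)) / (1 + x^2), 0.0, a)[1] / (2π)

h, a = 0.7, 2.5
lhs = T(h, a)
rhs = Φ(h)/2 + Φ(a*h)/2 - Φ(h)*Φ(a*h) - T(a*h, 1/a)
# lhs ≈ rhs
```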

You can test it with this code:

# owenteasy test values
using Random
using Test 

hvec = [0.0625, 6.5, 7.0, 4.78125, 2.0, 1.0, 0.0625, 1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25, 0.125, 0.125, 0.125, 0.125, 0.0078125
, 0.0078125, 0.0078125, 0.0078125, 0.0078125, 0.0078125, 0.0625, 0.5, 0.9]
avec = [0.25, 0.4375, 0.96875, 0.0625, 0.5, 0.9999975, 0.999999125, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 10, 100
, 0.999999999999999, 0.999999999999999, 0.999999999999999]
cvec = [big"0.0389119302347013668966224771378", big"2.00057730485083154100907167685e-11", big"6.399062719389853083219914429e-13"
, big"1.06329748046874638058307112826e-7", big"0.00862507798552150713113488319155", big"0.0667418089782285927715589822405"
, big"0.1246894855262192" 
, big"0.04306469112078537", big"0.06674188216570097", big"0.0784681869930841", big"0.0792995047488726", big"0.06448860284750375", big"0.1066710629614485"
, big"0.1415806036539784", big"0.1510840430760184", big"0.07134663382271778", big"0.1201285306350883", big"0.1666128410939293", big"0.1847501847929859"
, big"0.07317273327500386", big"0.1237630544953746", big"0.1737438887583106", big"0.1951190307092811", big"0.07378938035365545"
, big"0.1249951430754052", big"0.1761984774738108", big"0.1987772386442824", big"0.2340886964802671", big"0.2479460829231492", big"0.1246895548850744"
, big"0.1066710629614484", big"0.0750909978020473015080760056431386286348318447478899039422181015"]

mvbck = "\033[1A\033[58G"
mva = "\033[72G"
mve = "\033[90G"

warmup = owenteasy(0.0625,0.025)

for i in 1:size(hvec,1)
    if i == 1
        println("\t\tExecution Time","\033[58G","h",mva,"a",mve,"error\t\t","log10 error")
    end 
    h = hvec[i]
    a = avec[i]
    c = cvec[i]
    print(i,"\t")
    @time tx = owenteasy(h,a)
    err = Float64(round(tx - c,sigdigits=4))
    logerr = Float64(round(log10(abs(tx-c)),sigdigits=3))
    println(mvbck,h,mva,a,mve,err,"\t",logerr)
end 

Here are the results I get from the above testing:

                 Execution Time                           h             a                 error          log10 error
1         0.000013 seconds (4 allocations: 1.469 KiB)    0.0625        0.25              -9.243e-20     -19.0
2         0.000007 seconds (4 allocations: 1.469 KiB)    6.5           0.4375            -1.521e-27     -26.8
3         0.000003 seconds (4 allocations: 1.469 KiB)    7.0           0.96875           1.365e-28      -27.9
4         0.000003 seconds (4 allocations: 1.469 KiB)    4.78125       0.0625            -3.432e-23     -22.5
5         0.000003 seconds (4 allocations: 1.469 KiB)    2.0           0.5               -6.239e-19     -18.2
6         0.000003 seconds (4 allocations: 1.469 KiB)    1.0           0.9999975         -2.549e-17     -16.6
7         0.000040 seconds (78 allocations: 1.688 KiB)   0.0625        0.999999125       -4.573e-17     -16.3
8         0.000003 seconds (4 allocations: 1.469 KiB)    1.0           0.5               -1.39e-17      -16.9
9         0.000001 seconds (1 allocation: 16 bytes)      1.0           1.0               -1.467e-17     -16.8
10        0.000005 seconds (6 allocations: 1.500 KiB)    1.0           2.0               -1.749e-17     -16.8
11        0.000004 seconds (6 allocations: 1.500 KiB)    1.0           3.0               -4.593e-17     -16.3
12        0.000003 seconds (4 allocations: 1.469 KiB)    0.5           0.5               -1.017e-17     -17.0
13        0.000001 seconds (1 allocation: 16 bytes)      0.5           1.0               1.507e-17      -16.8
14        0.000005 seconds (6 allocations: 1.500 KiB)    0.5           2.0               -1.125e-16     -15.9
15        0.000003 seconds (6 allocations: 1.500 KiB)    0.5           3.0               -4.053e-17     -16.4
16        0.000003 seconds (4 allocations: 1.469 KiB)    0.25          0.5               -4.569e-18     -17.3
17        0.000001 seconds (1 allocation: 16 bytes)      0.25          1.0               -1.35e-18      -17.9
18        0.000003 seconds (6 allocations: 1.500 KiB)    0.25          2.0               1.313e-16      -15.9
19        0.000003 seconds (6 allocations: 1.500 KiB)    0.25          3.0               -4.461e-17     -16.4
20        0.000003 seconds (4 allocations: 1.469 KiB)    0.125         0.5               -1.717e-17     -16.8
21        0.000001 seconds (1 allocation: 16 bytes)      0.125         1.0               -2.796e-17     -16.6
22        0.000005 seconds (6 allocations: 1.500 KiB)    0.125         2.0               -6.153e-17     -16.2
23        0.000004 seconds (6 allocations: 1.500 KiB)    0.125         3.0               4.929e-18      -17.3
24        0.000003 seconds (4 allocations: 1.469 KiB)    0.0078125     0.5               3.781e-18      -17.4
25        0.000001 seconds (1 allocation: 16 bytes)      0.0078125     1.0               -1.026e-18     -18.0
26        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     2.0               1.615e-17      -16.8
27        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     3.0               2.795e-17      -16.6
28        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     10.0              7.214e-17      -16.1
29        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     100.0             -4.746e-17     -16.3
30        0.000027 seconds (78 allocations: 1.688 KiB)   0.0625        0.999999999999999 -4.486e-17     -16.3
31        0.000014 seconds (78 allocations: 1.688 KiB)   0.5           0.999999999999999 5.956e-17      -16.2
32        0.000013 seconds (78 allocations: 1.688 KiB)   0.9           0.999999999999999 -5.269e-18     -17.3
azev77 commented 3 years ago

@johnczito @andreasnoack @mschauer @stevengj This would be useful for Distributions.jl Including for the skewed Normal/T that I PRed…

stevengj commented 3 years ago

Whether using generic quadrature schemes is "sufficient" depends on the use case. It is typically orders of magnitude slower than specialized expansions like continued fractions, so you will probably be losing compared to other languages.

That being said, I agree that the optimal choice of algorithm these days is probably different than that old paper.

mschauer commented 3 years ago

Does it make sense to discuss this in the larger context of the package? Are our other implementations typically "as fast as reasonably can be" or is there a continuous work on adding correct but perhaps not optimal implementations and later performance improvements?

azev77 commented 3 years ago

@mschauer I agree. It looks like @blackeneth's implementation is already better than the current stuff that is used. Users can refer back here to Steven's comments & think about further optimizations in the future.

stevengj commented 3 years ago

True, but it's also nice to avoid "embarrassingly slow compared to R or Python", and it would be good to check whether the quadrature approach falls into this category.

blackeneth commented 3 years ago

After some micro-optimizations, the speed for calculating Owen's T when a ≤ 0.999999 is about 1 μs (plot: OwenT-legendregauss).

When 0.999999 < a < 1.0, QuadGK is called and the execution time is about 7.5 μs (plot: owent-quadrature).

Worst-case error is about 1e-16 (plot: owent-error).

With some additional type stability work and test development, we should be good to go.

Note the function will have dependencies on QuadGK and Distributions.

devmotion commented 3 years ago

As discussed in https://github.com/JuliaStats/StatsFuns.jl/pull/114, we should not add a circular dependency on Distributions to StatsFuns. It seems it is not actually needed, and you could just call normcdf etc.

I am also worried that a dependency on QuadGK will make the package slower to load. In the past this has been a reason to not merge PRs (even though I think https://github.com/JuliaStats/StatsFuns.jl/pull/106 should be fine 😛 - it only adds a lightweight dependency and IMO the timings are increased only to a small extent).

azev77 commented 3 years ago

@blackeneth can you compare your functions performance w implementations in other languages?

blackeneth commented 3 years ago

I can replace the Distributions.jl call to Normal() with normcdf(x) = 0.5*erfc(-x*invsqrt2)

As for R and Python, I can't say I'm an expert at benchmarking those, but my initial checks have R at 4.74 μs and Python at 2.5 μs.

I will look at using "T6" for 0.999999 < a < 1.0, which could eliminate the QuadGK dependency.
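The erfc-based replacement mentioned above is a one-liner. A self-contained sketch, assuming SpecialFunctions for erfc:

```julia
# Standard normal CDF via the complementary error function,
# removing the need for a Distributions.jl dependency
using SpecialFunctions: erfc

const invsqrt2 = 1 / sqrt(2)          # 1/√2
normcdf(x) = 0.5 * erfc(-x * invsqrt2)

normcdf(0.0)  # → 0.5
```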

devmotion commented 3 years ago

I can replace the Distributions.jl call to Normal() with normcdf(x) = 0.5*erfc(-x*invsqrt2)

That's precisely what I was referring to since normcdf is already defined in StatsFuns (https://github.com/JuliaStats/StatsFuns.jl/blob/36e48bbbac12355906438ce1993736757a0c6568/src/distrs/norm.jl#L45).

blackeneth commented 3 years ago

Using method "T6" for a > 0.999999 has similar accuracy to using QuadGK:

#             time                                       h                   a            error   log10(abs(error))
30        0.000001 seconds (1 allocation: 16 bytes)      0.0625        0.999999999999999 1.362e-18      -17.9
31        0.000001 seconds (1 allocation: 16 bytes)      0.5           0.999999999999999 5.242e-18      -17.3
32        0.000001 seconds (1 allocation: 16 bytes)      0.9           0.999999999999999 8.609e-18      -17.1
33        0.000002 seconds (1 allocation: 16 bytes)      2.5           0.999999999999999 1.22e-17       -16.9
34        0.000002 seconds (1 allocation: 16 bytes)      7.33          0.999999999999999 2.688e-17      -16.6
35        0.000001 seconds (1 allocation: 16 bytes)      0.6           0.999999999999999 -1.346e-17     -16.9
36        0.000002 seconds (1 allocation: 16 bytes)      1.6           0.999999999999999 -5.124e-17     -16.3
37        0.000002 seconds (1 allocation: 16 bytes)      2.33          0.999999999999999 1.616e-18      -17.8
38        0.000007 seconds (4 allocations: 1.469 KiB)    2.33          0.99999           -2.807e-18     -17.6

The performance is quite good for 0.999999 < a < 1.0 (plot: owent_a999999).

Some additional checks for type stability, then a few more tests, and it should be good for a PR.

blackeneth commented 2 years ago

The function has been ready for some time. The barrier is git: my VS Code is somehow set up wrong, so it wants to release the wrong files. I rarely use git, so I forget how to use it after every use. But now I'm moving, so I won't have time to figure it out and release it for quite a while.

Attached is the code for the function, owent(h,a). Also attached is the code for the tests.

If someone wants to take them and release them, please do so.

owent.txt OwenT_tests.txt