SciML / Integrals.jl

A common interface for quadrature and numerical integration for the SciML scientific machine learning organization
https://docs.sciml.ai/Integrals/stable/
MIT License
226 stars 30 forks source link

Gauss-Turán Quadratures #245

Open longemen3000 opened 8 months ago

longemen3000 commented 8 months ago

What kind of problems is it mostly used for? Please describe.

As most of those familiar with Gauss-Legendre quadratures, a one-dimensional integral can be aproximated to ∑(f(xᵢ)*wᵢ). where xᵢ and wᵢ depend on various factors. this equation notably uses just zero-order derivative information (just the function evaluated at the nodes). Gauss-Turán quadratures are an extension to this scheme, allowing the use of higher derivative information. This allows to higher precision with less nodes (given that you also have derivative information at the nodes).

Describe the algorithm you’d like The Gauss-Turán Quadrature is just expressed as a double sum:

∑wᵢ*∑(aᵢₖ∂ᵏf(xᵢ))

where ∂ᵏf(xᵢ) is just the k-th derivative evaluated at the i-th node, and wᵢ,aᵢₖ are just precomputed weights. The main algorithm needed is the one that calculates those weights and nodes, for one or more polynomial interpolants (like Legendre or Chebyshev).

Other implementations to know about

I haven't found other implementations online, but i did whip up an example with the weights and nodes in [1]. the code below is for a 5-point, k = 2 (value at the point, first and second derivative, the paper allows odd values only, so in the paper notation, it is 5-node, n = k/2 = 1 order quadrature)

xgt51 = [0.171573532582957e-02,
    0.516674690067835e-01,
    0.256596242621252e+00,
    0.614259881077552e+00,
    0.929575800557533e+00]

agt51 = [(0.121637123277610E-01,0.283102654629310E-04,0.214239866660517E-07),
(0.114788544658756E+00,0.141096832290629E-02,0.357587075122775E-04),
(0.296358604286158E+00,0.391442503354071E-02,0.677935112926019E-03),
(0.373459975331693E+00,-0.111299945126195E-02,0.139576858045244E-02),
(0.203229163395632E+00,-0.455530407798230E-02,0.226019273068948E-03)
]
function gt51(f,a,b)
    X = xgt51
    A = agt51
    A11,A21,A31 = A[1]
    #x1 = (y1 - a)/(b - a)
    y1 = X[1]*(b - a) + a
    f1,df1,d2f1 = f(y1)
    res = zero(f1)
    res += A11*f1 + A21*df1 + A31*d2f1
    for i in 2:5
        Ai1,Ai2,Ai3 = A[i]
        yi = X[i]*(b - a) + a
        fi,dfi,d2fi = f(yi)
        res += Ai1*fi + Ai2*dfi + Ai3*d2fi
    end
    return res*(b - a)
end

References

[1] Kovačević, M. A. (2000). Construction of generalized Gauss-Turán quadrature rules for systems of arbitrary functions. Computers & Mathematics with Applications (Oxford, England: 1987), 40(8–9), 895–909. doi:10.1016/s0898-1221(00)85001-4 [2] Gori, L., & Micchelli, C. (1996). On weight functions which admit explicit Gauss-Turán quadrature formulas. Mathematics of computation, 65(216), 1567-1581. link

lxvm commented 8 months ago

∑wᵢ*∑(aᵢₖ∂ᵏf(xᵢ))

Note that at a high level ∑wᵢ*∑(aᵢₖ∂ᵏf(xᵢ)) = ∑wᵢ* g(xᵢ) so this can be implemented as a meta-algorithm, e.g. GaussTuránRule(sensealg, n, k) == Integrals.ChangeOfVariables((f, domain) -> (gauss_turán_integrand(f, n, k, sensealg), domain), Integrals.QuadratureRule(gauss_turán_weights_nodes, n=n)) where some work would have to be done to implement gauss_turán_weights_nodes (which could simply tabulate wᵢ, xᵢ from the paper) and gauss_turán_integrand which would have to compute gradients of f with an AD backend selected from sensealg (e.g. ForwardDiff.jl) and values of aᵢₖ (which again could be tabulated). Here, Integrals.QuadratureRule is documented and in the public API and Integrals.ChangeOfVariables is internal and recently merged to the main branch, and not actually necessary for this case but a reference as another meta-algorithm.

This would be an interesting contribution for anyone interested in:

SouthEndMusic commented 4 months ago

So I went down quite the rabbit hole to implement an algorithm for computing Gauss-Turán quadratures 😄 Find my code here.

Here is the example from [1]:

using DoubleFloats

FT = Double64
n = 5
s = 1
a = FT(0.0)
b = FT(1.0)
derivs! = @eval @generate_derivs($(2s))

"""
    Example functions ϕⱼ from [1]
"""
function f(x::T, j::Int)::T where {T<:Number}
    pow = if j % 2 == 0
        j / 2 - 1
    else
        (j - 1) / 2 - 1 / 3
    end
    x^pow
end

deriv! = @eval @generate_derivs($(2s))
I = GaussTuran(
    f,
    deriv!,
    a,
    b,
    n,
    s;
    optimization_options = Optim.Options(;
        x_abstol = FT(1e-250),
        g_tol = FT(1e-250),
        show_trace = true,
        show_every = 100,
        iterations = 10_000,
    ),
)

# Integration error |I(fⱼ) - ₐ∫ᵇfⱼ(x)dx| first N functions fⱼ
max_int_error_upper =
    maximum(abs(I(x -> f(x, j)) - I.cache.rhs_upper[j]) for j = 1:I.cache.N)
@show max_int_error_upper
# max_int_error_upper = 3.0814879110195774e-32

# Integration error |I(fⱼ) - ₐ∫ᵇfⱼ(x)dx| last n functions fⱼ
max_int_error_lower = maximum(
    abs(I(x -> f(x, j)) - I.cache.rhs_lower[j-I.cache.N]) for
    j = (I.cache.N+1):(I.cache.N+I.cache.n)
)
@show max_int_error_lower
# max_int_error_lower = 2.8858134286698342e-30

# Example with eˣ
exp_error = abs(I(Base.exp) - (Base.exp(1) - 1))
@show exp_error;
# exp_error = -6.436621416953051e-18

The weights for the quadrature rule are not shown here but they are known (I.A). My results for the nodes differ from those in [1], but the integrals are correct so I claim that this is because I have achieved higher accuracy :) I am not familiar with the Integrals.jl API so I haven't incorporated that yet.

I didn't implement the construction algorithm from [1], but derived a loss function which I think is simpler given that we can use automatic differentiation these days.

SouthEndMusic commented 4 months ago

automating the calculation of higher-order derivatives of integrands

I struggled with that, I currently hardcode the derivatives of the integrands. I tried something like this, but that is a nightmare in terms of type stability.

ChrisRackauckas commented 4 months ago

That looks like a really good start!

lxvm commented 4 months ago

Indeed, it is great that you demonstrate that your nodes and weights get the same accuracy for the first 2(s+1)*n polynomials, which is something I do not get for the n=5,s=1 rule in [1], for which I observe a degradation in accuracy after the first 2n polynomials. I suppose the optimization problem is rather costly, so make sure to save your results! Packages like GeneralizedGauss.jl tabulate them as well.

SouthEndMusic commented 4 months ago

Regarding cost: I updated my code and the example to work with any AbstractFloat type. I learned that BigFloat makes for very slow code because it leads to a lot of allocation even for scalar operations. Obtaining the results shown above took only half a minute, and I'm sure there's room for optimization.

SouthEndMusic commented 4 months ago

I updated the code and the example above again. I added a macro for computing the first $n$ derivatives of a scalar function at once, maybe a nice feature for ForwardDiff.jl @ChrisRackauckas?

The only thing I don't understand is why initializing the optimization problem takes so long, more specifically computing the first Hessian.

ChrisRackauckas commented 4 months ago

The code needs to be shared to see, that @eval is very suspect though. Is there a reason to not use TaylorDiff for this part? ForwardDiff stacking has bad complexity.