fortran-lang / stdlib

Fortran Standard Library
https://stdlib.fortran-lang.org
MIT License

Proposal: gaussian quadrature #277

Closed hsnyder closed 3 years ago

hsnyder commented 3 years ago

Hi everyone.

Would there be interest in adding methods to compute Gaussian quadrature points and weights (plus possibly Gauss-Lobatto points and weights)? I'd be happy to offer the implementation if there is interest.

arjenmarkus commented 3 years ago

So far the library contains only the trapezoid and Simpson rules. More sophisticated rules are certainly welcome, I'd say.

ivan-pi commented 3 years ago

I'd certainly be interested in these. In a previous reply https://github.com/fortran-lang/stdlib/issues/112#issuecomment-576351981 I suggested these might be more suitable for a separate package. On second thought, since many Fortran users are physicists and engineers, at least for the classic Gaussian rules it might be beneficial to provide them here. If nothing else, the feedback from other stdlib developers and page visitors can help design a tidy API.

I have some very old Gaussian quadrature routines dating back to 1966 available here: https://github.com/ivan-pi/stroud_quad. If I am not mistaken they are accurate up to 10-11 digits for most cases.

jvdp1 commented 3 years ago

Thank you for your proposal. I agree with @arjenmarkus and @ivan-pi: it would be good to have them in stdlib ;)

hsnyder commented 3 years ago

Okay, sounds like the idea has some support. I'll wait another couple of days (holidays coming up anyway) and if there aren't any objections raised, we can discuss an API.

certik commented 3 years ago

@hsnyder do not wait. We want this in stdlib; as we agreed, the scope of stdlib is roughly that of SciPy, and thus it definitely includes quadrature rules. Go ahead and start proposing the API. See also my comment with an implementation accurate to full double precision: https://github.com/fortran-lang/stdlib/issues/112#issuecomment-575963239

certik commented 3 years ago

I think this is a duplicate of #112. But I'll keep it open, we can close both #277 and #112 once this is implemented in stdlib.

ivan-pi commented 3 years ago

The NAG and IMSL mathematical libraries both offer Gaussian quadrature weights.

hsnyder commented 3 years ago

Thanks for the input, everyone. Lots of options for the implementation, it seems (depending on licenses).

Regarding API, I would strongly prefer returning the points and weights directly and having the user use sum(f(x)*w) rather than having the user pass in a function and having the library perform the integration.

One idea regarding the API: something virtually identical to what @certik has here: https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/quadrature.f90 i.e. function gauss_pts(n) and function gauss_weights(n)

but since you'll always want both, I could see the argument for a subroutine: subroutine gauss_xw(n,x,w), where x and w would be intent(out).
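
For illustration, here is a minimal, self-contained sketch of the "return points and weights, let the user sum" pattern. The 2-point Gauss-Legendre rule is hard-coded as a stand-in for whatever gauss_xw-style routine we end up with; nothing here is a settled API:

program demo_points_and_weights
    use iso_fortran_env, only: dp => real64
    implicit none
    real(dp) :: x(2), w(2)

    ! Stand-in for the output of a gauss_xw-style routine:
    ! the 2-point Gauss-Legendre rule on [-1,1] has nodes +-1/sqrt(3) and weights 1.
    x = [-1.0_dp/sqrt(3.0_dp), 1.0_dp/sqrt(3.0_dp)]
    w = [1.0_dp, 1.0_dp]

    ! The user performs the integration themselves:
    ! approximates the integral of exp(x) over [-1,1], i.e. e - 1/e ~ 2.3504.
    print *, sum(exp(x)*w)
end program demo_points_and_weights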

We should provide analogous functions for Lobatto points and weights as well, and perhaps others. Thoughts?

jvdp1 commented 3 years ago

I am not a typical user of such functions. However, I suggest using an API similar to trapz and trapz_weights.

certik commented 3 years ago

@hsnyder yes, we can lump them together to return points and weights at once. The same for lobatto. Indeed, returning them to the user is the way to go.

hsnyder commented 3 years ago

@jvdp1 I think that since the use case for gaussian quadrature is a bit different (usually you're integrating a function rather than discrete samples), it makes more sense to return the points and weights instead.

@certik sounds good, I like your tabulated implementation from hfsolver. I propose to use that as the basis for an implementation, and fall back on another implementation in the n > 64 case. From what various people have posted in this thread and in #112, it seems like there are several options. Is there a favourite among them? Quadpack perhaps?

hsnyder commented 3 years ago

I've started working on an implementation... Should we provide separate implementations for getting the points and weights in real32, or support double precision only and let the user convert if they care? If we do want to offer real32 versions, is it important that the whole implementation be real32, or can the computation be done in 64 and then down-converted?

I'd be surprised if there's a lot of use of 32 bit platforms anymore, but figured I'd ask.

arjenmarkus commented 3 years ago

Wrt 32-bit platforms: probably not that much indeed, but that is a different matter from 32-bit versus 64-bit reals ;).

For the sort of problems I work on professionally, 32 bits (or rather single precision, as I prefer to call it; the two are not entirely equivalent) generally offers quite enough precision. Single precision means 5 or 6 decimals. If you consider such quantities as the water depth (to borrow from my field), that would typically mean better than 1 mm precision (assuming water depths up to 1 km). That is far more precise than any measurements in that field. Using single precision also means a more economical use of memory and disk space. That is certainly not to say that double precision does not enter my field of expertise - I merely mean that both types of real precision are still useful.

That said, I would say that for the purposes of this part of the standard library it suffices to have double-precision points and weights and have a convenient interface to convert to single-precision.

hsnyder commented 3 years ago

Hi @arjenmarkus

Good points - it was lazy of me to conflate 32 bit reals with 32 bit platforms. As per your suggestion, I'll provide interfaces for both kinds of reals, but the back-end computation of the nodes and weights will be done in 64 bit.

milancurcic commented 3 years ago

Should we provide separate implementations for getting the points and weights in real32, or support double precision only and let the user convert if they care? If we do want to offer real32 versions, is it important that the whole implementation be real32, or can the computation be done in 64 and then down-converted?

@hsnyder Sorry for the late chime in on this.

If feasible, it would be ideal to have real32-, real64-, and real128-specific procedures, i.e. if the implementation is the same for all three and only the interface changes, I think we should provide them.

But if the implementation is different for each, then I suggest picking the one that's most useful for you (e.g. real64). If there are users who also want real32, they will chime in or contribute code.

I recommend against providing a real32 interface that is real64 under the hood, for two reasons:

  1. Users would think they're using a real32 procedure, but they wouldn't be getting the benefit of real32 because it's real64 under the hood. IMO this interface would be misleading.
  2. Users can convert to real32 themselves if they really need that.

I also tend to use real32 for the same reason as @arjenmarkus.

What do you think?

ivan-pi commented 3 years ago

I agree with the previous comments that single precision reals are still sufficient for many engineering applications.

I agree with @milancurcic only partially concerning the misleading interface. For Gaussian quadrature the user will typically calculate the weights only once at the beginning of the program, and this calculation will normally be only a minor fraction of the total computational cost.

Recently, Larry Young, a retired chemical engineer with interest in numerics, camshaft design, and also Fortran, wrote a very comprehensive monograph about collocation methods and Gaussian quadrature. A PDF version can be found at: https://www.tildentechnologies.com/Numerics/

He provides a thorough analysis of the accuracy and algorithmic efficiency of various methods for computing Gaussian quadrature weights. Quoting from section 2.4.8 on page 94:

Gauss or Lobatto points and weights with n= 100 can be calculated in only 25 or 60 μsec with recurrent and composite calculations, respectively, while a million points and weights can be calculated in less than 0.3 sec.

With nAsmp of 40 the switch from recurrent to composite method causes a factor of 3 to 4 increase in computation time, but the overall time is so small it hardly seems relevant, so the small improvement in accuracy seems justified.

(I believe this paragraph applies to double precision calculations on an Intel Core i7-7600U, 2.8 GHz.)

The conditioning of the calculations is such that you are typically able to obtain at least 14 decimals (out of a range of 16 decimals for 64-bit reals). Assuming the conditioning issues remain similar for single precision reals, you would only get 5-6 decimals out of a range of 7. In particular, when using single precision Gaussian quadrature coefficients I think it would be crucial to have all the digits correct. This only seems possible if you compute them in higher precision, initialize them as known constants, or load them from a file. Given that most processors with Fortran compiler support provide double precision reals, I don't think this would really be a problem.

Having written this, I support your option #2: users should convert to single precision reals themselves.

arjenmarkus commented 3 years ago

Option 2 has the benefit that people will be aware of the difference in precision. Yes, that is worthwhile.

hsnyder commented 3 years ago

Good points all around. I'm in agreement that we should provide a 64 bit version only, for all the reasons already discussed. I have a bit of work done on the implementation already, but looking at the workflow document it looks like I'm supposed to post the spec first? I'll write that as soon as I have a chance, and will post it for discussion.

I'll also look at the monograph linked by ivan-pi, sounds interesting.

Do we use a typical fork and PR workflow? Should I fork the stdlib, put my implementation in my copy, and then open a PR?

milancurcic commented 3 years ago

@hsnyder Great, thank you!

We don't need the full and final spec before opening the PR. What's important to discuss before the PR is the API of the function, i.e. procedure name, dummy arguments, their types, etc. We can develop the spec doc during the PR. There should be a complete spec done before the PR is merged. In summary:

  1. Discuss API in this thread and get consensus;
  2. Open a PR with implementation and spec draft.

Do we use a typical fork and PR workflow? Should I fork the stdlib, put my implementation in my copy, and then open a PR?

:+1:

hsnyder commented 3 years ago

Understood. Here are the subroutine signatures I propose:

    subroutine gausslegendre(x,w,interval)
        real(real64), intent(out) :: x(:), w(:)
        real(real64), optional, intent(in) :: interval(2) ! This defaults to [-1.0_real64, 1.0_real64]
    end subroutine

    subroutine gausslobatto(x,w,interval)
        real(real64), intent(out) :: x(:), w(:)
        real(real64), optional, intent(in) :: interval(2) ! This defaults to [-1.0_real64, 1.0_real64]
    end subroutine

x and w are the points and weights respectively. The size of the arrays passed in determines the number of points to be calculated (which is obvious, I suppose). The optional argument interval, if provided, will scale the points and weights to the interval [interval(1), interval(2)] (the default is [-1, 1]).
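
For clarity, the interval handling is just the usual affine change of variables. A hedged sketch of what the optional argument would do internally (the helper name here is purely illustrative, not part of the proposal):

subroutine rescale_rule(x, w, interval)
    ! Map a rule (x, w) defined on [-1,1] to [a,b] = [interval(1), interval(2)]:
    !   x_ab = (a+b)/2 + (b-a)/2 * x,   w_ab = (b-a)/2 * w
    ! so that sum(f(x)*w) then approximates the integral of f over [a,b].
    use iso_fortran_env, only: dp => real64
    implicit none
    real(dp), intent(inout) :: x(:), w(:)
    real(dp), intent(in) :: interval(2)
    real(dp) :: a, b
    a = interval(1)
    b = interval(2)
    x = 0.5_dp*(a + b) + 0.5_dp*(b - a)*x
    w = 0.5_dp*(b - a)*w
end subroutine rescale_rule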

I'm not particularly married to the actual names, though these seem clear to me without being excessively verbose.

EDIT: updated function signatures in accordance with the below comments from @gareth-d-ga and @milancurcic

gareth-d-ga commented 3 years ago

You might consider making the domain of integration (interval(2)) an optional argument to gausslegendre_xw and gausslobatto_xw, with a default value of [-1.0_real64, 1.0_real64]. Then there is no need for shift_xw.

subroutine gausslegendre_xw(x, w, interval)
    real(real64), intent(out) :: x(:), w(:)
    real(real64), optional, intent(in) :: interval(2) ! This defaults to [-1.0_real64, 1.0_real64]
end subroutine

subroutine gausslobatto_xw(x, w, interval)
    real(real64), intent(out) :: x(:), w(:)
    real(real64), optional, intent(in) :: interval(2) ! This defaults to [-1.0_real64, 1.0_real64]
end subroutine

hsnyder commented 3 years ago

Excellent idea, I updated the proposed API in accordance with your suggestion.

milancurcic commented 3 years ago

In the naive interpretation of the name gausslegendre_xw, the _xw part suggests to me that this returns x and w (correct), but also that this serves to differentiate it from gausslegendre_somethingelse.

In other words, is there anything else that these quadratures could do or return? If not, then the _xw part is redundant. Can we do:

subroutine gausslegendre(x, w, interval)
    ...
subroutine gausslobatto(x, w, interval)
    ...

?

hsnyder commented 3 years ago

Another good point; I've updated the proposed API.

hsnyder commented 3 years ago

Actually, @milancurcic, I'm having second thoughts about that... "gausslegendre" suggests the function has something to do with gauss-legendre quadrature, but "gausslegendre_xw" more clearly signifies that the function returns the points and weights. If it's too terse we could change the suffix to "_ptswts", but I prefer "_xw" personally.

Not that big a deal either way, but curious for your reaction.

milancurcic commented 3 years ago

You're right that "gausslegendre_xw" is more specific. But the arguments already document this.

Is there anything else that "gausslegendre" could be confused with? In other words, a user coming to the library and wanting the Gauss-Legendre quadrature--would they wonder what "gausslegendre" does assuming that they see the subroutine signature?

If some appendage is required, _xw is my favorite.

@arjenmarkus @jvdp1 @ivan-pi what do you think about the naming?

For reference, here's a short list of names elsewhere:

jvdp1 commented 3 years ago

I am not a typical user of such procedures. However, regarding the name, if gausslegendre cannot be confused with another procedure having a different objective, then I think that _xw is redundant because x and w are already in the API as arguments.

To add to @milancurcic's list:

R: gaussLegendre https://www.rdocumentation.org/packages/pracma/versions/1.9.9/topics/gaussLegendre

Numpy: leggauss https://numpy.org/doc/stable/reference/generated/numpy.polynomial.legendre.leggauss.html

See also name examples of such F90 procedures here: https://people.sc.fsu.edu/~jburkardt/f_src/quadrule/quadrule.html

arjenmarkus commented 3 years ago

I can imagine that another procedure in this context might be the direct application of the quadrature method to a given function. That would likely be called "integrate_gausslegendre" or something along those lines, though. Also the name of the encompassing module could serve as an indicator. I must admit that I did not immediately associate the "_xw" with the x-coordinates and weights ;). That said, I lean towards "gausslegendre" without the suffix (but definitely both names, not one of the abbreviations or omissions that are used in the Julia, Python or C examples).

hsnyder commented 3 years ago

So I was working on the implementation for this and realized that under the hood it involves calculating the Legendre polynomials and their derivatives. So I thought, why not expose that too? The most natural function name for computing the Legendre polynomial is, in my opinion, simply legendre. The derivative could be dlegendre.

Anyways, I propose adding these functions as well. Proposed API:

elemental real function legendre (N, x)
    integer :: N ! which legendre polynomial?
    real :: x ! the point to evaluate at
end function

elemental real function dlegendre (N, x)
    integer :: N ! which legendre polynomial?
    real :: x ! the point to evaluate at
end function

We could make these generic and supply implementations for real32 and real64 - really no reason not to in this case.


A separate concern: I was thinking about the method name gausslobatto and, to be honest, it is ambiguous. For example, Gauss-Chebyshev-Lobatto quadrature also exists; the correct term here is Gauss-Legendre-Lobatto quadrature. So I propose amending the names suggested above to gauss_legendre and gauss_legendre_lobatto. Is that okay?

Lastly, regarding @arjenmarkus's thought about offering integrate_gauss_legendre, I'm open to offering the implementation but I'll need someone else to propose the API - I'm still pretty new to Fortran and don't understand much about passing procedures as function arguments.
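
For reference, a hedged sketch of how a function argument is typically passed in modern Fortran, via an abstract interface. The module and procedure names are placeholders and this is not a proposal for the stdlib API:

module integrate_sketch
    use iso_fortran_env, only: dp => real64
    implicit none

    abstract interface
        ! Interface that any user-supplied integrand must match.
        function integrand(x) result(y)
            import :: dp
            real(dp), intent(in) :: x
            real(dp) :: y
        end function integrand
    end interface

contains

    ! Apply a quadrature rule (x, w) to a user-supplied function f.
    ! The nodes and weights would come from e.g. the gausslegendre routine above.
    function integrate_with_rule(f, x, w) result(s)
        procedure(integrand) :: f
        real(dp), intent(in) :: x(:), w(:)
        real(dp) :: s
        integer :: i
        s = 0.0_dp
        do i = 1, size(x)
            s = s + w(i)*f(x(i))
        end do
    end function integrate_with_rule

end module integrate_sketch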

ivan-pi commented 3 years ago

Anyways, I propose adding these functions as well. Proposed API:

real function legendre (N, x)
    integer :: N ! which legendre polynomial?
    real :: x ! the point to evaluate at
end function

real function dlegendre (N, x)
    integer :: N ! which legendre polynomial?
    real :: x ! the point to evaluate at
end function

We could make these generic and supply implementations for real32 and real64 - really no reason not to in this case. We can't make these functions elemental, but we could supply versions for scalar x and for arrays of any valid rank.

I believe this is something we want and has been mentioned previously in issue #179. See page 15 of this former WG5 proposal: https://wg5-fortran.org/N1651-N1700/N1688.pdf

I can't see any reason why this could not be an elemental function. If the arguments are scalars, then as long as you put the elemental attribute in front, it should just work. (The use of an elemental procedure does not fit well with the way the polynomials are actually calculated, using the recurrence relations.) I guess the actual challenge is writing a bullet-proof implementation which works accurately for any N or x.

hsnyder commented 3 years ago

I had thought that only single-argument functions could be elemental but you're right of course, we can make them elemental. I have an implementation that is iterative rather than recursive. I believe it will work. I've edited my above post to reflect the use of elemental.

Should these polynomial functions be in stdlib_quadrature, or in another module? stdlib_math or stdlib_functions maybe?
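
For concreteness, here is a minimal sketch of the iterative evaluation mentioned above, using Bonnet's three-term recurrence (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x). This is only an illustration, not necessarily what will end up in the PR; in stdlib it would live in a module so callers get the explicit interface an elemental function requires:

elemental function legendre(n, x) result(p)
    ! Evaluate the Legendre polynomial P_n(x) iteratively (no recursion),
    ! which is compatible with the elemental attribute.
    use iso_fortran_env, only: dp => real64
    integer, intent(in) :: n
    real(dp), intent(in) :: x
    real(dp) :: p
    real(dp) :: p_prev, p_next
    integer :: k
    p_prev = 1.0_dp          ! P_0(x)
    p      = x               ! P_1(x)
    if (n == 0) then
        p = p_prev
        return
    end if
    do k = 1, n - 1
        p_next = ((2*k + 1)*x*p - k*p_prev)/(k + 1)
        p_prev = p
        p      = p_next
    end do
end function legendre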

bcornille commented 3 years ago

A common pattern for evaluating Legendre polynomials is to request the result of all polynomials up to a certain order. For this case it would be beneficial to have an interface like

subroutine legendre(N, x, p)
    integer :: N ! highest legendre polynomial degree
    real :: x ! the point to evaluate
    real :: p(0:N)
end subroutine

The downside of this interface is of course that it is not a function and cannot be marked elemental. It could be made into a GENERIC interface for single- and multi-point evaluation as well as array or scalar output. I have implemented some classes for Gauss-Legendre and Lobatto-Legendre cardinal function evaluation that use this type of interface. They also store the reference-interval nodes and weights, which could be useful for reuse in quadrature over many segments. An interface for requesting nodes and weights for arbitrary intervals is provided. Implementing Gaussian quadrature as a class may not be right for stdlib, but it would allow efficient re-use of a single quadrature scheme over many intervals.

bcornille commented 3 years ago

I think it's true that you can't mix subroutines and functions. The generic would have to be something like

interface legendre
    module subroutine legendre_scalar(N, x, p)
        integer :: N
        real :: x
        real :: p
    end subroutine
    module subroutine legendre_array(N, x, p)
        integer :: N
        real :: x
        real :: p(0:N)
    end subroutine
end interface

I know this would not be compatible with your currently proposed API. Two more overloads for x(:) input could also be added. The upside here is flexibility, but the downside is of course the scalar cases have a poorer API than the functions.

hsnyder commented 3 years ago

I think it's true that you can't mix subroutines and functions.

Sorry, I deleted the post that @bcornille was responding to here, because I realized it was based on a misread of his first post.

Hmmm, I do prefer the function interface for the scalar/elemental case... What about including the function versions I proposed above and, separately:

subroutine legendre_upto (N, x, p)
    integer :: N ! highest legendre polynomial degree
    real :: x ! the point to evaluate
    real :: p(0:N)
end subroutine

We can change the suffix to something else (upto isn't the best), but my point is what about using a different procedure name to offer that interface as well?

bcornille commented 3 years ago

@hsnyder That seems like a perfectly reasonable solution as well. I'm new to this forum, so I would leave the discussion of the utility of this additional function open to others.

I do think there may be some value in using a class for the Gaussian quadrature scheme, since the calculation of nodes and weights of a certain order could be expensive if redone for many intervals. The implementation of Gauss-Legendre and Lobatto-Legendre cardinal functions I mentioned can be found here. There is some additional setup done, but something similar could be done where the quadrature scheme is set up with a call to a type-bound procedure, with additional type-bound procedures to get the nodes and weights, or to pass a function and an interval to do the integration. This may be much more OO-style than desired for stdlib.

hsnyder commented 3 years ago

I would strongly prefer a simple subroutine call that just gives me the nodes and weights. I agree that an object would be useful for some applications but I think we should provide the simpler interface and let the user build scaffolding around it if it's of use to them, rather than force them to deal with a more complex API when they just want the nodes and weights once.

But yes, let's see what the rest of the community thinks.

jvdp1 commented 3 years ago

I would strongly prefer a simple subroutine call that just gives me the nodes and weights. I agree that an object would be useful for some applications but I think we should provide the simpler interface and let the user build scaffolding around it if it's of use to them, rather than force them to deal with a more complex API when they just want the nodes and weights once.

In the past, the preference has been for simple procedures over OOP. If OOP is desired by the community, it can always be built on top of the simpler procedures.

ivan-pi commented 3 years ago

There have been multiple similar discussions in other issues. Ultimately, the choice between a procedural or OO interface will be a user preference, so we could provide both. In some applications there is good potential for code reuse, meaning that once the procedural interface for gausslegendre is available, we can build a derived type with members x and w and various methods (integration, interval selection, etc.). (Of course you could also do it the other way round, with the procedural interface using the derived type internally.)
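
As an illustration of that layering, a hedged sketch of a thin derived type that just stores a rule computed once (e.g. by the proposed gausslegendre routine) and maps it onto arbitrary intervals. All names here are placeholders, not a proposal:

module gauss_rule_sketch
    use iso_fortran_env, only: dp => real64
    implicit none

    type :: gauss_rule
        real(dp), allocatable :: x(:), w(:)   ! reference nodes/weights on [-1,1]
    contains
        procedure :: on_interval
    end type gauss_rule

contains

    ! Reuse the stored rule on [a,b] without recomputing nodes and weights.
    subroutine on_interval(self, a, b, x_ab, w_ab)
        class(gauss_rule), intent(in) :: self
        real(dp), intent(in) :: a, b
        real(dp), intent(out) :: x_ab(:), w_ab(:)
        x_ab = 0.5_dp*(a + b) + 0.5_dp*(b - a)*self%x
        w_ab = 0.5_dp*(b - a)*self%w
    end subroutine on_interval

end module gauss_rule_sketch

An integrate method accepting a procedure argument could be layered on in the same way.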

Personally, my interest in Gaussian quadrature is related to solving one-dimensional PDEs with the method of weighted residuals. A fully fledged package for this purpose (like the one in preparation by Tilden Technologies (TT)) would cover all of the following functionalities (image borrowed from the TT website):

[Screenshot: feature overview from the Tilden Technologies website]

On the other hand, looking at the SciPy library, it currently only provides routines for Gaussian quadrature (fixed_quad), nodes and weights of such quadratures (roots_legendre), and evaluation of orthogonal polynomials (eval_legendre and legendre, where the latter returns a np.poly1d object which contains the coefficients of the polynomial in monomial form).

A search on GitHub for eval_legendre returns around 6000 code hits, the majority of which appear to be simply copies of the original scipy repository. Among them you can find actual usage in electrostatic solvers, quantum mechanics, et cetera. If we also counted the number of times these are used from Boost, Julia, R, MATLAB, and C, and in all of the historical Fortran codes, I am sure we could conclude that evaluation and quadrature are used frequently enough to deserve inclusion in stdlib. On the other hand, the full orthogonal polynomial tool set would probably be better pursued as an fpm project first (less need for review, more flexibility to experiment with the API, etc.).

hsnyder commented 3 years ago

My use case is also weighted residual techniques and I'd love to see more functionality in stdlib cater to that sort of thing. However, I'm also mindful of scope creep... I really just want to get the quadrature functions in ASAP, and since I'm internally computing the Legendre polynomials, I figured why not expose an API for that too. We can revisit other functionality related to PDE solvers in future proposals. @ivan-pi I'd be happy to work on the extended functionality with you in the future if you want.

ivan-pi commented 3 years ago

I agree we should not pursue any advanced targets at the moment. I am looking forward to reviewing your pull request :+1:.

Hopefully, the procedural interfaces will open a door to other polynomials and quadratures and get more people involved.

hsnyder commented 3 years ago

Sorry for the delay, everyone. I have opened a preliminary pull request. Looking forward to discussing it.

ivan-pi commented 3 years ago

I have a comment regarding the implementation --- what is the accuracy? I had a bad experience with higher orders (say p=64) that weren't accurate to machine precision with such methods. So we resorted to calculating the weights and points in high precision and generating code that hard-codes the numbers valid to all digits. But this comment is about the implementation, which is minor.

@certik perhaps we can discuss the issue here further. Some experiments by Larry Young (available here, page 242) show the weight errors calculated in 64-bit floating point precision by six different formulas (some using the polynomial recurrence relations). The errors grow approximately as n**2:

[Figure from Young's monograph (p. 242): weight errors versus n for six calculation methods]

Do you think the reference stdlib implementation should guarantee full machine precision? In that case up to which n value would you say hard-coding the constants is a reasonable solution? In principle one can do the calculation in higher-precision internally. The drawback is this increases the computational cost, and might not work across different compilers.

The Standard documents typically contain sentences such as: "The value of the result is a processor-dependent approximation...", freeing the (library) developers to do what they see fit. Should the stdlib specification documents follow a similar approach?

hsnyder commented 3 years ago

For what it's worth, I have the same concern about accuracy, but since users might expect this function to be fast, I didn't want to use a quad precision implementation. I did compare this result to the pre-calculated values that @certik linked to, and the results differed by at most ~1.1e-16 at n=64, with most of the values being exactly the same.

Nevertheless, I do think this is a valid concern, and I'm curious for other people's thoughts on how best to address it.

certik commented 3 years ago

So if the results only differ at the 1e-16 level, then I am fine with calculating them. My experience with similar calculations was much worse, on the order of 1e-13 or even less accurate, consistent with the graph that @ivan-pi posted. The issue is that there are errors in the finite element method where this is used, so if the quadrature alone is already at 1e-13, the overall error is even higher, and that was not acceptable for us.

As to how high to go, 64 seems to be enough for double precision for all my applications. We can even tabulate up to 64 and compute beyond that. Or if the implementation really is 1e-16 level accurate, we can simply calculate all.

bcornille commented 3 years ago

It is probably unnecessary for typical use cases, but O(1) methods of calculating the nodes and weights do exist. These avoid the Newton iteration, but do seem to use a large table of tabulated values. FWIW I use the typical Newton method for my work as well.

hsnyder commented 3 years ago

I am definitely open to tabulating, but here is a comparison between your tabulated values and this implementation for N=64. The first column is just the index into the x and w arrays, the second is the difference in x, and the final column is the difference in w.

           1   0.0000000000000000       -1.9277114626792269E-016
           2   0.0000000000000000       -2.5153490401663703E-017
           3   0.0000000000000000        6.5052130349130266E-017
           4   0.0000000000000000       -1.4571677198205180E-016
           5   0.0000000000000000        1.0408340855860843E-016
           6  -1.1102230246251565E-016   1.3877787807814457E-017
           7   0.0000000000000000        1.2143064331837650E-016
           8   0.0000000000000000        1.2490009027033011E-016
           9   0.0000000000000000        8.6736173798840355E-017
          10   0.0000000000000000        1.7347234759768071E-017
          11   0.0000000000000000        0.0000000000000000     
          12   0.0000000000000000       -1.5959455978986625E-016
          13   0.0000000000000000        6.9388939039072284E-017
          14   0.0000000000000000       -2.0816681711721685E-017
          15   0.0000000000000000        4.1633363423443370E-017
          16   0.0000000000000000        6.2450045135165055E-017
          17   0.0000000000000000        8.3266726846886741E-017
          18   0.0000000000000000        1.2490009027033011E-016
          19   0.0000000000000000        3.4694469519536142E-017
          20   0.0000000000000000        6.9388939039072284E-018
          21   0.0000000000000000        2.0816681711721685E-017
          22   0.0000000000000000        9.7144514654701197E-017
          23   0.0000000000000000       -5.5511151231257827E-017
          24   0.0000000000000000        1.7347234759768071E-016
          25   0.0000000000000000        6.9388939039072284E-017
          26   0.0000000000000000        1.0408340855860843E-016
          27   0.0000000000000000        6.9388939039072284E-017
          28   0.0000000000000000        6.9388939039072284E-017
          29   0.0000000000000000        5.5511151231257827E-017
          30   1.3877787807814457E-017   1.3877787807814457E-017
          31   0.0000000000000000        5.5511151231257827E-017
          32  -1.0408340855860843E-017   6.9388939039072284E-018
          33   3.4694469519536142E-018   0.0000000000000000     
          34   0.0000000000000000        6.9388939039072284E-018
          35  -1.3877787807814457E-017   2.7755575615628914E-017
          36  -2.7755575615628914E-017   5.5511151231257827E-017
          37   0.0000000000000000        6.9388939039072284E-017
          38   0.0000000000000000        9.0205620750793969E-017
          39   0.0000000000000000        1.0408340855860843E-016
          40   0.0000000000000000        6.9388939039072284E-017
          41   0.0000000000000000        9.7144514654701197E-017
          42   0.0000000000000000       -5.5511151231257827E-017
          43   0.0000000000000000        9.7144514654701197E-017
          44   0.0000000000000000        2.0816681711721685E-017
          45   0.0000000000000000        6.9388939039072284E-018
          46   0.0000000000000000        3.4694469519536142E-017
          47   0.0000000000000000        1.2490009027033011E-016
          48   0.0000000000000000        8.3266726846886741E-017
          49   0.0000000000000000        1.0408340855860843E-016
          50   0.0000000000000000        4.1633363423443370E-017
          51   0.0000000000000000       -2.0816681711721685E-017
          52   0.0000000000000000        6.9388939039072284E-017
          53   1.1102230246251565E-016  -4.8572257327350599E-017
          54   1.1102230246251565E-016   6.9388939039072284E-018
          55   0.0000000000000000        1.7347234759768071E-017
          56   0.0000000000000000        8.6736173798840355E-017
          57   0.0000000000000000        1.2490009027033011E-016
          58   0.0000000000000000        4.8572257327350599E-017
          59   1.1102230246251565E-016   1.3877787807814457E-017
          60   0.0000000000000000        1.0408340855860843E-016
          61   0.0000000000000000       -1.4571677198205180E-016
          62   0.0000000000000000        6.5052130349130266E-017
          63   0.0000000000000000       -2.5153490401663703E-017
          64   0.0000000000000000       -1.9277114626792269E-016

hsnyder commented 3 years ago

It wouldn't be particularly difficult to change the implementation to quad precision and, from there, generate tabulated values.

certik commented 3 years ago

Your implementation is incredibly accurate; this accuracy would be fine with me. Note that @ivan-pi's graph shows much lower accuracy than this. I think that must be due to you using the recursive formula, which (if I remember well from 10 years ago when I played with it) is more accurate.

certik commented 3 years ago

Are you showing absolute or relative errors? 2.22e-16 is machine epsilon, so it looks like you got all numbers converged to machine epsilon.

hsnyder commented 3 years ago

That is absolute error (just straight difference). In that case I guess we will leave it.

One area where improvement might be possible is the initial guess for the GL points. Right now I am just using the Chebyshev points, but I am not sure how this will hold up at high N values.
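
For reference, the scheme being described is essentially the classical one: a Chebyshev-type starting guess cos(pi*(i - 1/4)/(n + 1/2)) for the i-th root, refined by Newton iteration on P_n, with weights 2/((1 - x**2)*Pn'(x)**2). A hedged sketch of that scheme (not the PR's actual code; the subroutine name is a placeholder):

subroutine gauss_legendre_newton(x, w)
    ! Sketch: nodes and weights of the n-point Gauss-Legendre rule on [-1,1],
    ! with n = size(x), via Newton iteration started from Chebyshev-like guesses.
    use iso_fortran_env, only: dp => real64
    implicit none
    real(dp), intent(out) :: x(:), w(:)
    real(dp), parameter :: pi = 3.14159265358979323846_dp
    real(dp) :: xi, dx, p0, p1, p2, dp1
    integer :: n, i, k, iter

    n = size(x)
    do i = 1, n
        xi = cos(pi*(i - 0.25_dp)/(n + 0.5_dp))     ! initial guess for the i-th root
        do iter = 1, 100
            p0 = 1.0_dp                              ! P_0(xi)
            p1 = xi                                  ! P_1(xi)
            do k = 1, n - 1                          ! recurrence up to P_n(xi)
                p2 = ((2*k + 1)*xi*p1 - k*p0)/(k + 1)
                p0 = p1
                p1 = p2
            end do
            dp1 = n*(xi*p1 - p0)/(xi*xi - 1.0_dp)    ! P_n'(xi) via derivative identity
            dx  = -p1/dp1                            ! Newton update
            xi  = xi + dx
            if (abs(dx) <= 4.0_dp*epsilon(1.0_dp)) exit
        end do
        x(i) = xi                                    ! nodes come out in decreasing order
        w(i) = 2.0_dp/((1.0_dp - xi*xi)*dp1*dp1)
    end do
end subroutine gauss_legendre_newton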