QuaLiKiz-group / QuaLiKiz


Remove NAG dependency #27

Open Karel-van-de-Plassche opened 7 years ago

Karel-van-de-Plassche commented 7 years ago

We would like to remove our dependency on the proprietary NAG library. However, during early testing the CUBPACK library was deemed too slow. Maybe @jcitrin can post some more information.

jcitrin commented 7 years ago

We depend strongly on the d01fcf algorithm from NAG, a multidimensional adaptive integration routine based on the Genz-Malik algorithm. CUBPACK (from Genz) is open source, but was found to be somewhat less robust than d01fcf.

In any case, I think it's best to write the integration routines from scratch, both the 1D and 2D ones, using standard adaptive algorithms customized to our particular integrands. The 2D integrals are of the form $\int e^{-x^2-y^2} f(x,y)\,dx\,dy$, so in principle Gauss-Hermite could be the way forward. However, $f(x,y)$ has resonant behaviour, so the Gauss-Hermite routines I've already tried failed to converge.
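To illustrate the failure mode: a tensor-product Gauss-Hermite rule handles the $e^{-x^2-y^2}$ weight natively and converges almost immediately for smooth $f$, but a sharply peaked $f$ defeats it. A minimal NumPy sketch, where the Lorentzian is a hypothetical stand-in and not the actual QuaLiKiz integrand:

```python
import numpy as np

def gauss_hermite_2d(f, n):
    # Tensor-product Gauss-Hermite estimate of the integral over R^2 of
    # exp(-x^2 - y^2) * f(x, y); nodes/weights are for the weight exp(-x^2).
    x, w = np.polynomial.hermite.hermgauss(n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    return float((np.outer(w, w) * f(X, Y)).sum())

# Smooth integrand: exact value is pi * exp(-1/2).
smooth = lambda x, y: np.cos(x) * np.cos(y)

# Hypothetical stand-in for a resonant integrand: a Lorentzian peak of
# width eps, much narrower than the node spacing of any practical rule.
eps = 1e-2
resonant = lambda x, y: eps / ((x - 0.5) ** 2 + eps ** 2)

for n in (8, 16, 32, 64):
    print(n, gauss_hermite_2d(smooth, n), gauss_hermite_2d(resonant, n))
```

For the smooth case the n=16 estimate already matches the analytic value; for the peaked case the estimates typically keep drifting as n grows, which is the non-convergence described above.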

I strongly feel that a custom integration routine would speed up QuaLiKiz, make it more robust, and remove the NAG dependency. This is a high-priority action.

aaronkho commented 7 years ago

Karel mentioned something regarding the non-deterministic nature of QLK outputs, and I found that mildly disturbing... Since Karel suspected the issue came from the integration algorithms, I had a look at the NAG functions used in QLK. I did find something of interest, so I'm putting the concern here.

It turns out the d01ahf subroutine, which evaluates the 1D integrals, applies a random (linear) transformation to the function before integrating. This random transformation is based on a pseudo-random number generator also within the NAG library, the function g05caf. This is supposedly done to "smooth out singularities", though I am uncertain how or why this would work if it is random. The code can be tested with a constant seed for the RNG to check whether this is the sole cause of the non-deterministic behaviour; if so, that would give additional motivation to replace this algorithm with a deterministic one.
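For reference, a deterministic replacement need not be exotic. A plain globally-recursive adaptive Simpson rule (a sketch of the general idea, not d01ahf's actual scheme) is RNG-free and bit-reproducible by construction:

```python
import math

def adaptive_simpson(f, a, b, tol=1e-10):
    # Globally recursive adaptive Simpson with Richardson correction.
    def simpson(fa, fm, fb, a, b):
        return (b - a) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(a, b, fa, fm, fb, whole, tol):
        m = 0.5 * (a + b)
        lm, rm = 0.5 * (a + m), 0.5 * (m + b)
        flm, frm = f(lm), f(rm)
        left = simpson(fa, flm, fm, a, m)
        right = simpson(fm, frm, fb, m, b)
        if abs(left + right - whole) < 15.0 * tol:
            return left + right + (left + right - whole) / 15.0
        return (recurse(a, m, fa, flm, fm, left, 0.5 * tol)
                + recurse(m, b, fm, frm, fb, right, 0.5 * tol))

    m = 0.5 * (a + b)
    fa, fm, fb = f(a), f(m), f(b)
    return recurse(a, b, fa, fm, fb, simpson(fa, fm, fb, a, b), tol)

# Same inputs, same outputs, every time -- no hidden RNG state.
print(adaptive_simpson(math.sin, 0.0, math.pi))  # ~2.0
```

Identical inputs always give bit-identical outputs, which is exactly the property a constant-seed test of d01ahf would probe.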

Some good news, though: I found that the d01fcf subroutine, which evaluates the multidimensional integrals, is explicitly deterministic. So no problems there.

aaronkho commented 7 years ago

Heyo, I got bored. So I did some digging regarding this replacement issue.

I found something called the ACM TOMS library. In particular, TOMS algorithm #468 in this library supposedly (hearsay) does the same thing as the NAG subroutine D01AHF (adaptive 1D finite integrals). I did not check whether it also relies on an RNG.

The algorithms can be found on the website http://www.netlib.org. The license for the ACM TOMS collection of algorithms can be found on the website http://www.acm.org/publications/policies/software-copyright-notice. It looks pretty open-source to me.

Citation: T. N. L. Patterson. 1973. Algorithm 468: algorithm for automatic numerical integration over a finite interval [D1]. Commun. ACM 16, 11 (November 1973), 694-699. DOI: https://doi.org/10.1145/355611.362543
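For context, Patterson's algorithm builds a nested family of quadrature rules in which every new rule reuses all previous function evaluations, stopping when two consecutive estimates agree. A toy sketch of that stopping logic, using the humble trapezoid sequence rather than Patterson's optimal nested rules:

```python
import math

def nested_trapezoid(f, a, b, tol=1e-8, max_level=20):
    # Nested-rule stopping logic: each refinement halves the step and reuses
    # every previous evaluation; stop when consecutive estimates agree.
    h = b - a
    total = 0.5 * (f(a) + f(b))
    prev = h * total
    n = 1
    for _ in range(max_level):
        h *= 0.5
        n *= 2
        # Only the new midpoints are evaluated; old points stay in `total`.
        total += sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
        est = h * total
        if abs(est - prev) < tol:
            return est
        prev = est
    return prev

print(nested_trapezoid(math.sin, 0.0, math.pi))  # ~2.0
```

Each refinement only adds midpoints, so no evaluation is wasted; that economy of function calls is what makes the real Patterson scheme attractive for expensive integrands.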


Also, a search of "genz" in the netlib repository yields TOMS algorithm #698, which has a coincidentally similar description (also hearsay) to the other NAG subroutine, D01FCF (multi-dimensional integrals over a (TRIGGER WARNING for Karel) hyper-rectangle).

Citation: Jarle Berntsen, Terje O. Espelid, and Alan Genz. 1991. Algorithm 698: DCUHRE: an adaptive multidimensional integration routine for a vector of integrals. ACM Trans. Math. Softw. 17, 4 (December 1991), 452-456. DOI: https://doi.org/10.1145/210232.210234
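The strategy these codes share is globally adaptive cubature: keep a priority queue of subregions ranked by estimated error, and always subdivide the worst one. A minimal sketch, using a Simpson/trapezoid pair as a stand-in for the Genz-Malik rules the real DCUHRE/D01FCF use:

```python
import heapq

def adaptive_cube_2d(f, ax, bx, ay, by, tol=1e-6, max_regions=2000):
    # Local rule pair on one rectangle: tensor-product Simpson for the
    # estimate, corner trapezoid for a (very conservative) error gauge.
    def local(ax, bx, ay, by):
        hx, hy = bx - ax, by - ay
        xs = (ax, 0.5 * (ax + bx), bx)
        ys = (ay, 0.5 * (ay + by), by)
        v = [[f(x, y) for y in ys] for x in xs]
        ws = (1.0, 4.0, 1.0)
        simp = hx * hy / 36.0 * sum(ws[i] * ws[j] * v[i][j]
                                    for i in range(3) for j in range(3))
        trap = hx * hy / 4.0 * (v[0][0] + v[0][2] + v[2][0] + v[2][2])
        return simp, abs(simp - trap)

    est, err = local(ax, bx, ay, by)
    heap = [(-err, ax, bx, ay, by, est)]
    total_est, total_err = est, err
    while total_err > tol and len(heap) < max_regions:
        # Pop the rectangle with the largest error estimate, split it in four.
        nerr, ax0, bx0, ay0, by0, est0 = heapq.heappop(heap)
        total_est -= est0
        total_err += nerr  # nerr is the negated error
        mx, my = 0.5 * (ax0 + bx0), 0.5 * (ay0 + by0)
        for a1, b1, a2, b2 in ((ax0, mx, ay0, my), (mx, bx0, ay0, my),
                               (ax0, mx, my, by0), (mx, bx0, my, by0)):
            e, r = local(a1, b1, a2, b2)
            heapq.heappush(heap, (-r, a1, b1, a2, b2, e))
            total_est += e
            total_err += r
    return total_est
```

The error gauge here grossly overestimates the Simpson error, so the routine over-refines; the Genz-Malik rules in the real codes pair rules of close degree precisely to avoid that waste.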


I could take a look into testing NAG and ACM TOMS for I/O, numerical, and speed equivalency, but this would have to wait until I finish the first draft of my paper (likely late November). If someone has time before then, please feel free to have a stab.


Side Note

The ADAPT algorithm on Prof. Genz's website is effectively an open-source equivalent of the NAG subroutine D01FCF. It is also written in FORTRAN, though I'm not sure whether the inputs/outputs are in the same order, of the same types, etc.

Link: http://www.math.wsu.edu/faculty/genz/homepage

Citation of equivalency: Statistical Multiple Integration: Proceedings of a Joint Summer Research Conference Held at Humboldt University, June 17-23, 1989

I do not know if this particular algorithm is identical to the implementation within the CUBPACK tool, or slightly modified.

jcitrin commented 7 years ago

Thanks @aaronkho. The open-source 1D adaptive integration looks promising. For the cubature (2D), having an open-source version of d01fcf would of course be great, since we know it works, and then we would "de-NAG". However, revisiting the integration is also an opportunity to make the algorithm more robust and faster. Perhaps the key is to feed information on the resonance of the integrand (which we know analytically) into the routine. This may be our actual problem with d01fcf: for a resonant integrand, withholding prior information can occasionally kill a global adaptive routine. The package below looks promising in that regard:

http://www.feynarts.de/cuba/

We would want to use the "deterministic algorithms", which are faster; there are only two of those in this package. The CUHRE algorithm is, I think, the same as NAG's d01fcf (the Genz algorithm). The Divonne algorithm looks very interesting, since one can provide information about the integrand to speed up convergence.
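To see why supplying the resonance can matter so much, here is a hypothetical example (the Lorentzian, its width, and the substitution are illustrative, not QuaLiKiz's actual integrand): a fixed Gauss-Legendre rule essentially never sees a width-1e-6 peak, while the same rule applied after a tangent substitution centred on the known resonance is essentially exact.

```python
import math
import numpy as np

# Hypothetical stand-in integrand: a Lorentzian resonance whose position x0
# and width eps are known analytically (as the resonance is in QuaLiKiz).
eps, x0 = 1e-6, 0.3
f = lambda x: eps / ((x - x0) ** 2 + eps ** 2)

def gauss_legendre(g, a, b, n=64):
    # Fixed n-point Gauss-Legendre rule mapped from [-1, 1] to [a, b].
    t, w = np.polynomial.legendre.leggauss(n)
    x = 0.5 * (b - a) * t + 0.5 * (a + b)
    return 0.5 * (b - a) * float(w @ g(x))

exact = math.atan((1.0 - x0) / eps) - math.atan((-1.0 - x0) / eps)

# Blind rule: the nodes are ~0.03 apart, so the peak goes unresolved.
naive = gauss_legendre(f, -1.0, 1.0)

# Informed rule: substitute x = x0 + eps*tan(u), built from the known
# resonance parameters, which flattens the peak completely.
ua = math.atan((-1.0 - x0) / eps)
ub = math.atan((1.0 - x0) / eps)
informed = gauss_legendre(lambda u: f(x0 + eps * np.tan(u)) * eps / np.cos(u) ** 2,
                          ua, ub)
print(naive, informed, exact)
```

As I understand it, Divonne generalizes this idea: the user points out the peaks and the algorithm partitions the region around them, instead of discovering them by blind subdivision.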

I had also planned on looking at it this month when some time frees up, so let's all keep each other updated if somebody starts working on it, to avoid inefficient parallel processing ;)

Karel-van-de-Plassche commented 6 years ago

I found something called the ACM TOMS library. In particular, TOMS algorithm #468 in this library supposedly (hearsay) does the same things as the NAG subroutine, D01AHF (adaptive 1D finite integrals). I did not check if it also relies on a RNG.

I took a stab at implementing ACM TOMS #468 in 0a52ce7be898c161d3d0d52ccf74e6d8dffd666f, but it seems to converge significantly less often than D01AHF. However, our variables are double precision while ACM TOMS #468 is written in single precision, so I might have dropped the ball in the conversion. I'll keep you posted.
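One concrete thing that can go wrong in a single-to-double port: internal tolerances and machine-epsilon constants left in single precision make the requested double-precision accuracy unreachable, so the routine reports non-convergence. A small NumPy illustration of the precision gap:

```python
import numpy as np

# Single vs. double precision machine epsilon: a code that compares against
# ~1.2e-7 internally can never certify double-precision accuracy.
eps32 = float(np.finfo(np.float32).eps)   # ~1.19e-07
eps64 = float(np.finfo(np.float64).eps)   # ~2.22e-16
print(eps32, eps64)

# An increment below eps32 simply vanishes in float32 arithmetic, so a
# convergence test like |new - old| < 1e-10 can never fire there.
x = np.float32(1.0)
print(x + np.float32(1e-8) == x)   # True in float32; the same sum in
print(1.0 + 1e-8 == 1.0)           # float64 does not lose the increment
```

In the Fortran source this typically means promoting not just the REAL declarations but also the exponent literals (1.0E0 vs 1.0D0); a missed literal silently keeps part of the computation in single precision.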