nimIntegrate - Githubissues

paciorek commented 12 months ago

This is WIP, for comments from @paul-vdb @perrydv and (if interested) @danielturek .

Based on initial work by Paul and Perry, this creates a new DSL keyword, nimIntegrate, that behaves like R's integrate, calling out to Rdqag{s,i} for bounded and unbounded 1-d numerical integration.

Some things to consider:

[ ] This allows a user to pass additional parameters to the integrand via a required param = double(1) argument to nimIntegrate. The user then has to unpack information from this vector for use in the integrand function (the integrand may ignore the argument, but it must have it). Not sure if we want to try to make this more user-friendly (or indeed how to do that).
[ ] The user must provide a vectorized integrand. Not sure if we want to relax this and allow a non-vectorized integrand that we then vectorize in NimIntegrateProblem::fn in nimIntegrate.cpp.
[ ] For the moment, we just return the estimated integral. R's integrate returns an error estimate and indeed a full list with diagnostic info. We could handle this using functionality similar to nimOptim, in which case the user's nimbleFunction that uses nimIntegrate() would have to unpack the output list. Perhaps it's enough in this initial release to leave as is. [update 2023-12-18: we plan to start with returning a 2-vector of the value and the uncertainty and in future work allow the user to indicate whether to return just the result, the 2-vector, or a list where the list would also contain the string-typed result message.] [update 2024-01-03: actually might as well return a 3-vector with ierr as the code for the message and tell the user the code mapping. To do this we will allocate a nimArray of length 3 in the nimIntegrate class, return that array and have sizeExpr be 3 in sizeIntegrate. We don't think there will be any issue with memory deallocation.]
[ ] We allocate temporary working space for Rdqag{s,i} in the work and iwork arrays. This is done when the "problem" object is created, so presumably happens every time nimIntegrate is invoked. I'm not seeing a good way to create those working objects only once. I guess we could allow the user to create them in their nimbleFunction and pass them in. [update 2024-01-03: Perry suggested use of static class vectors so that the temp arrays are shared by all members of the class and we resize the array as needed. Likely leave this for future work.]
[ ] Perhaps use std vectors.
[ ] Testing, roxygen, and manual are all set up.
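
For reference, the two points above about vectorization and return values can be illustrated with base R's integrate alone. This is only a sketch of the behavior nimIntegrate is meant to mirror; it does not call NIMBLE, and the names vec_integrand, scalar_integrand and value_and_error are made up for illustration.

```r
# Base-R illustration only; nothing here calls NIMBLE.

# A vectorized integrand: takes a vector x, returns a vector of equal length.
theta <- 3.1415927
vec_integrand <- function(x) x * theta

res <- integrate(vec_integrand, lower = 0, upper = 1)
res$value      # integral estimate (the exact answer is theta / 2)
res$abs.error  # estimate of the absolute error
res$message    # "OK" on success

# A scalar-only integrand (`if` is not vectorized) has to be wrapped
# with Vectorize() before integrate() can evaluate it on a vector.
scalar_integrand <- function(x) if (x < 0) 0 else x^2
res2 <- integrate(Vectorize(scalar_integrand), lower = 0, upper = 1)

# The 2-vector proposed in the 2023-12-18 update would hold these two numbers.
value_and_error <- c(res$value, res$abs.error)
```

In R the message is a string; the 2024-01-03 update proposes returning the underlying integer code (ierr) instead, so that the result fits in a numeric vector.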

Some example code:

integrand <- nimbleFunction(
    run = function(x = double(1), theta = double(1)) {
        return(x*theta[1])
        returnType(double(1))
    }
)

foo <- nimbleFunction(
    run = function(theta = double(0), lower = double(0), upper = double(0)) {
        tmp = c(theta, 0)
        return(integrate(integrand, lower, upper, tmp))
        returnType(double())
    }
)

cfoo <- compileNimble(foo)

foo(3.1415927, 0,1)
cfoo(3.1415927, 0,1)

# With a model:

code <- nimbleCode({
    sigma ~ dunif(0,5)
    tau ~ dunif(0,5)
    mu0 ~ dnorm(0, sd = 100)
    for(i in 1:n) 
        y[i] ~ dintglmm(mu0,sigma,tau)
})

integrand <- nimbleFunction(
    run = function(mu = double(1), pars = double(1)) {
        # Sum the log densities, then exponentiate to get the joint density.
        result = exp(dnorm(pars[1], mu, sd = pars[3], log = TRUE) + dnorm(mu, pars[2], sd = pars[4], log = TRUE))
        return(result)
        returnType(double(1))
    }
)

dintglmm <- nimbleFunction(
    run = function(x = double(0), mu0 = double(0), sigma = double(0),
                   tau = double(0), log = integer(0, default = 0)) {
        returnType(double(0))
        pars <- c(x,mu0,sigma,tau)
        prob <- integrate(integrand, -100, 100, pars)
        if(log) return(log(prob)) else return(prob)
    }
)

m <- nimbleModel(code, data = list(y = rnorm(5)), constants = list(n=5),
                 inits = list(mu0 = 0, sigma = 1, tau = 1))
cm <- compileNimble(m)
cm$calculate('y')

paciorek commented 10 months ago

Ok, I have an essentially full-featured version of nimIntegrate set up. (A bit ago I thought it might be more bare-bones, but it was pretty straightforward to get everything going.)

The output is a 3-vector containing the estimate, uncertainty and a result code. We could implement variations down the road, but I'm pretty happy with it as is.

@paul-vdb particularly given your experience with using numerical integration in models, would like any thoughts you have on the user interface and roxygen/manual. @perrydv let me know if you see any technical issues in terms of the C++.

paciorek commented 10 months ago

Actually, @perrydv one thing I'd like input on is the following. If the param argument is an expression rather than the name of a variable, e.g., integrate(integrand, lower, upper, c(theta,0)) as in the last test in test-integrate.R, then we get the C++ compilation error below, because nimIntegrate expects a NimArr<1, double>& as the param argument.

> printErrors()
using C++ compiler: ‘g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’
P_31_rcFun_R_GlobalEnv67.cpp: In function ‘NimArr<1, double> rcFun_R_GlobalEnv67(double, double, double)’:
P_31_rcFun_R_GlobalEnv67.cpp:51:22: error: no matching function for call to ‘nimIntegrate(NimArr<1, double> (&)(NimArr<1, double>&, NimArr<1, double>&), double&, double&, Eigen::CwiseNullaryOp<concatenate1Class<long int, std::vector<double>, double>, Eigen::Matrix<double, -1, -1> >, int, double, double, bool)’
   51 | output = nimIntegrate(rcFun_R_GlobalEnv66, ARG2_lower_, ARG3_upper_, nimCd((ConcatenateInterm_60)), 100, 0.0001220703125, 0.0001220703125, true);
      |          ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from P_31_rcFun_R_GlobalEnv67.h:11,
                 from P_31_rcFun_R_GlobalEnv67.cpp:13:
/tmp/nim-int3/nimble/include/nimble/nimIntegrate.h:97:19: note: candidate: ‘template<class Fn> NimArr<1, double> nimIntegrate(Fn, double, double, NimArr<1, double>&, int, double, double, bool)’
   97 | NimArr<1, double> nimIntegrate(
      |                   ^~~~~~~~~~~~
/tmp/nim-int3/nimble/include/nimble/nimIntegrate.h:97:19: note:   template argument deduction/substitution failed:
P_31_rcFun_R_GlobalEnv67.cpp:51:75: note:   cannot convert ‘concatenate_impl<Eigen::Matrix<double, -1, -1> >::concatenate<std::vector<double> >(ConcatenateInterm_60)’ (type ‘Eigen::CwiseNullaryOp<concatenate1Class<long int, std::vector<double>, double>, Eigen::Matrix<double, -1, -1> >’) to type ‘NimArr<1, double>&’
   51 | output = nimIntegrate(rcFun_R_GlobalEnv66, ARG2_lower_, ARG3_upper_, nimCd((ConcatenateInterm_60)), 100, 0.0001220703125, 0.0001220703125, true);

I'm not sure whether we want to error trap and tell users they need to manually 'lift' the expression for param, or actually allow this.
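
A minimal sketch of the manual "lift", following the pattern of the first example in this thread: assign the expression to a local variable and pass that variable as param. It assumes an installed NIMBLE version that provides nimIntegrate with the 3-vector return described above; the names foo and param are illustrative only.

```r
library(nimble)

integrand <- nimbleFunction(
    run = function(x = double(1), theta = double(1)) {
        return(x * theta[1])
        returnType(double(1))
    }
)

foo <- nimbleFunction(
    run = function(theta = double(0), lower = double(0), upper = double(0)) {
        # Lift the expression into a named variable first ...
        param <- c(theta, 0)
        # ... then pass the variable, not c(theta, 0), as the param argument.
        output <- nimIntegrate(integrand, lower, upper, param)
        return(output)
        returnType(double(1))
    }
)

# Uncompiled call; the first element is the integral estimate.
res <- foo(3.1415927, 0, 1)
```

The same function can then be passed to compileNimble, since the generated C++ call receives a named array rather than a concatenation expression.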

paul-vdb commented 10 months ago

@paciorek sorry for my lack of response on this thread. Of course, now that I want to have this function already in NIMBLE, I am more motivated to help out. I am happy to help out with some roxygen. I actually have a very explicit example right now, playing around with a generalized von Mises distribution that uses R's integrate function. What do you need from me explicitly?

paul-vdb commented 10 months ago

@paciorek Played around a little with nimIntegrate and I love that it is in there. I made a generalized von Mises with it that is silly simple with the integrate function. Awesome work!

integrand <- nimbleFunction(
    run = function(x = double(1), theta = double(1)) {
        return( exp(theta[1] * cos(x) + theta[2] * cos(2 * (x + theta[3]))) )
        returnType(double(1))
})
dGenVonMises <- nimbleFunction(
  run = function(x = double(1), mu1 = double(), mu2 = double(), kappa1 = double(), kappa2 = double(), limits = double(1)){
    range <- limits[2] - limits[1]
    mu1R <- (mu1 - range/2)/range*2*pi
    mu2R <- (mu2 - range/2)/range*2*pi
    z <- (x - range/2)/range*2*pi
    d = (mu1R - mu2R) %% pi
    num <- exp(kappa1 * cos(z - mu1R) + kappa2 * cos(2 * (z - mu2R)))
    tmp <- c(kappa1, kappa2, d)
    den <- nimIntegrate(integrand, lower = 0, upper = 2*pi, tmp)[1]
    dens <- num/den
    return(dens*2*pi/range)
    returnType(double(1))
  }
)

cgvonmises <- compileNimble(dGenVonMises)
x <- 0:360
# plot(x, circular::dgenvonmises(circular(x, type = "angles", units = "degrees"), -1, 1, 2, 1))
plot(x, cgvonmises(x, mu1=20, mu2=300, kappa1=2, kappa2=1, limits = c(0, 360)), type = 'l')

paciorek commented 9 months ago

Thanks @paul-vdb . I may try to make use of your example in the manual section on nimIntegrate. I don't think I needed anything explicit other than any feedback you might have.

paul-vdb commented 9 months ago

@paciorek The main thing I would highlight in the roxygen is that we have to pass all the parameters into the param vector of the integrand, unlike R's integrate, which you can call via

integrate(integrand, lower = -Inf, upper = Inf, mu = 0, sigma = 2)

whereas with nimIntegrate the integrand can have only two arguments: 1) a one-dimensional vector of values of the variable being integrated over, and 2) an object (vector, matrix, array...) containing all the parameters for the integration.

nimIntegrate(integrand, lower = -Inf, upper = Inf, param = c(0,2))
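
The contrast can be shown in plain R without NIMBLE. In this sketch both calls use base R's integrate; the second only imitates the nimIntegrate convention of a single packed parameter vector. The names f_named and f_packed are hypothetical.

```r
# R style: extra parameters are passed as named arguments through `...`.
f_named <- function(x, mu, sigma) dnorm(x, mean = mu, sd = sigma)
r1 <- integrate(f_named, lower = -Inf, upper = Inf, mu = 0, sigma = 2)

# nimIntegrate style: all parameters packed into one vector and
# unpacked by position inside the integrand.
f_packed <- function(x, param) dnorm(x, mean = param[1], sd = param[2])
r2 <- integrate(f_packed, lower = -Inf, upper = Inf, param = c(0, 2))

# Both integrate the same normal density, so both estimates are close to 1.
c(r1$value, r2$value)
```

With the packed form the integrand has to document which position holds which parameter, which is why this is worth highlighting in the roxygen.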

I think when nimbleFunctions with setup code are eventually allowed in model code (@perrydv which sounds very soonish?), we can relax some of these restrictions by defining constants separately in a method before calling nimIntegrate. Also, having the integrand in a method instead of a separate function will be great.

I think for a user interested in integration it totally meets the objective. I'll think of a more compelling NHPP model to fit to add to the manual to actually show people when it might be useful. Although so easily adding a generalized von Mises is very compelling to me! Note that the example I added doesn't match the circular package as their example is only a distribution for radians. I've added the Jacobian to make it the distribution for the original input (e.g. day of the year, degrees etc.) which is what I personally wanted it for.