nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License
158 stars 24 forks source link

nimIntegrate #1357

Closed paciorek closed 9 months ago

paciorek commented 12 months ago

This is WIP, for comments from @paul-vdb @perrydv and (if interested) @danielturek .

Based on initial work by Paul and Perry, this creates a new DSL keyword, nimIntegrate that behaves like R's integrate, calling out to Rqdag{s,i} for bounded and unbounded 1-d numerical integration.

Some things to consider:

Some example code:

integrand <- nimbleFunction(
    run = function(x = double(1), theta = double(1)) {
        return(x*theta[1])
    returnType(double(1))
  }
)

foo <- nimbleFunction(
    run = function(theta = double(0), lower = double(0), upper = double(0)) {
        tmp = c(theta, 0)
    return(integrate(integrand, lower, upper, tmp))
    returnType(double())
  }
)

cfoo <- compileNimble(foo)

foo(3.1415927, 0,1)
cfoo(3.1415927, 0,1)

# With a model:

code <- nimbleCode({
    sigma ~ dunif(0,5)
    tau ~ dunif(0,5)
    mu0 ~ dnorm(0, sd = 100)
    for(i in 1:n) 
        y[i] ~ dintglmm(mu0,sigma,tau)
})

integrand <- nimbleFunction(
    run = function(mu = double(1), pars = double(1)) {
        result = exp(dnorm(pars[1], mu, sd = pars[3]) + dnorm(mu, pars[2], sd = pars[4]))
        return(result)
        returnType(double(1))
    }
)

dintglmm <- nimbleFunction(
    run = function(x = double(0), mu0 = double(0), sigma = double(0),
                   tau = double(0), log = integer(0, default = 0)) {
        returnType(double(0))
        pars <- c(x,mu0,sigma,tau)
        prob <- integrate(integrand, -100, 100, pars)
        if(log) return(log(prob)) else return(prob)
    },

    )

m <- nimbleModel(code, data = list(y = rnorm(5)), constants = list(n=5),
                 inits = list(mu0 = 0, sigma = 1, tau = 1))
cm <- compileNimble(m)
cm$calculate('y')
paciorek commented 10 months ago

Ok, I have an essentially full-featured version of nimIntegrate set up. (A bit ago I thought it might be more bare-bones, but it was pretty straightforward to get everything going.)

The output is a 3-vector containing the estimate, uncertainty and a result code. We could implement variations down the road, but I'm pretty happy with it as is.

@paul-vdb particularly given your experience with using numerical integration in models, would like any thoughts you have on the user interface and roxygen/manual. @perrydv let me know if you see any technical issues in terms of the C++.

paciorek commented 10 months ago

Actually, @perrydv one thing I'd like input on is the following. If we try to have the param argument be an expression rather than the name of a variable such as integrate(integrand, lower, upper, c(theta,0)), as seen in the last test in test-integrate.R, then we get this C++ compilation issue because integrate is expecting a nimArray as the param argument.

> printErrors()
using C++ compiler: ‘g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’
P_31_rcFun_R_GlobalEnv67.cpp: In function ‘NimArr<1, double> rcFun_R_GlobalEnv67(double, double, double)’:
P_31_rcFun_R_GlobalEnv67.cpp:51:22: error: no matching function for call to ‘nimIntegrate(NimArr<1, double> (&)(NimArr<1, double>&, NimArr<1, double>&), double&, double&, Eigen::CwiseNullaryOp<concatenate1Class<long int, std::vector<double>, double>, Eigen::Matrix<double, -1, -1> >, int, double, double, bool)’
   51 | output = nimIntegrate(rcFun_R_GlobalEnv66, ARG2_lower_, ARG3_upper_, nimCd((ConcatenateInterm_60)), 100, 0.0001220703125, 0.0001220703125, true);
      |          ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from P_31_rcFun_R_GlobalEnv67.h:11,
                 from P_31_rcFun_R_GlobalEnv67.cpp:13:
/tmp/nim-int3/nimble/include/nimble/nimIntegrate.h:97:19: note: candidate: ‘template<class Fn> NimArr<1, double> nimIntegrate(Fn, double, double, NimArr<1, double>&, int, double, double, bool)’
   97 | NimArr<1, double> nimIntegrate(
      |                   ^~~~~~~~~~~~
/tmp/nim-int3/nimble/include/nimble/nimIntegrate.h:97:19: note:   template argument deduction/substitution failed:
P_31_rcFun_R_GlobalEnv67.cpp:51:75: note:   cannot convert ‘concatenate_impl<Eigen::Matrix<double, -1, -1> >::concatenate<std::vector<double> >(ConcatenateInterm_60)’ (type ‘Eigen::CwiseNullaryOp<concatenate1Class<long int, std::vector<double>, double>, Eigen::Matrix<double, -1, -1> >’) to type ‘NimArr<1, double>&’
   51 | output = nimIntegrate(rcFun_R_GlobalEnv66, ARG2_lower_, ARG3_upper_, nimCd((ConcatenateInterm_60)), 100, 0.0001220703125, 0.0001220703125, true);

I'm not sure whether we want to error trap and tell users they need to manually 'lift' the expression for param, or actually allow this.

paul-vdb commented 10 months ago

@paciorek sorry for my lack of response on this thread. Of course, now that I want to have this function already in Nimble, I am more motivated to help out. I am happy to help out with some rOxygen. I actually have a very explicit example right now playing around with a generalized von mises distribution that uses R's integrate function. What do you need from me explicitly?

paul-vdb commented 10 months ago

@paciorek Played around a little with nimIntegrate and I love that it is in there. I made a generalized von Mises with it that is silly simple with the integrate function. Awesome work!

integrand <- nimbleFunction(
    run = function(x = double(1), theta = double(1)) {
        return( exp(theta[1] * cos(x) + theta[2] * cos(2 * (x + theta[3]))) )
        returnType(double(1))
})
dGenVonMises <- nimbleFunction(
  run = function(x = double(1), mu1 = double(), mu2 = double(), kappa1 = double(), kappa2 = double(), limits = double(1)){
    range <- limits[2] - limits[1]
    mu1R <- (mu1 - range/2)/range*2*pi
    mu2R <- (mu2 - range/2)/range*2*pi
    z <- (x - range/2)/range*2*pi
    d = (mu1R - mu2R) %% pi
    num <- exp(kappa1 * cos(z - mu1R) + kappa2 * cos(2 * (z - muR2)))
    tmp <- c(kappa1, kappa2, d)
    den <- nimIntegrate(integrand, lower = 0, upper = 2*pi, tmp)[1]
    dens <- num/den
    return(dens*2*pi/range)
    returnType(double(1))
  }
)

cgvonmises <- compileNimble(dGenVonMises)
x <- 0:360
# plot(x, circular::dgenvonmises(circular(x, type = "angles", units = "degrees"), -1, 1, 2, 1))
plot(x, cgvonmises(x, mu1=20, mu2=300, kappa1=2, kappa2=1, limits = c(0, 360)), type = 'l')
paciorek commented 9 months ago

Thanks @paul-vdb . I may try to make use of your example in the manual section in nimIntegrate. I don't think I needed anything explicit other than any feedback you might have.

paul-vdb commented 9 months ago

@paciorek The main thing I would highlight in the roxygen is that we have to pass all the parameters into the param vector of the integrand, unlike R which can you can call via

integrate(integrand, lower = -Inf, upper = Inf, mu = 0, sigma = 2)

where here they have can only have two arguments to the integrand. 1) a single dimension for integrating in vector format, 2) an object (vector, matrix, array...) containing all the parameters for integration.

nimIntegrate(integrand, lower = -Inf, upper = Inf, param = c(0,2))

I think when nimble functions with setup code are eventually allowed into model code (@perrydv which sounds very soonish?), then we can adapt some of these restrictions via defining constants separately in a method before calling nimIntegrate. Also, having the integrand in a method instead of a separate function will be great.

I think for a user interested in integration it totally meets the objective. I'll think of a more compelling NHPP model to fit to add to the manual to actually show people when it might be useful. Although so easily adding a generalized von Mises is very compelling to me! Note that the example I added doesn't match the circular package as their example is only a distribution for radians. I've added the Jacobian to make it the distribution for the original input (e.g. day of the year, degrees etc.) which is what I personally wanted it for.

perrydv commented 9 months ago

@paciorek I am starting to do a code review on this. One thing is can you check against #1378 (now merged into devel)? I think there, in fixing a nimOptim issue, you had modified exprClasses_setSizes with a special case used only by nimOptim, and now in this PR you are using the same approach for nimIntegrate. But I had revised your changes to #1378 in a way that I thought was more consistent with the system and less of a hot-wired special case, and my version of those little bits are what got merged into devel. So right now your nimIntegrate code would revert to the intermediate version you had in #1378 before I changed it. I think it would be better to use the later (now merged) approach for nimIntegrate as well. Let me know if you want me to make these changes.

perrydv commented 9 months ago

Alright I'm seeing a few other issues. It looks like the use of eval in keyword processing could cause a problem if the user has done rol.tel = my_tol, i.e. passed a local variable. And it looks like we could imitate sizeOptim to handle cases where f is either a method, a method of another nimbleFunction, or an RCfunction. I know we are just aiming for basics, but it seems like it's all there to imitate and maybe was just not clear or familiar. Seeing these, I will plan to make changes directly unless you'd prefer me not to.

paciorek commented 9 months ago

Yes, agreed about your improved handling in #1378. I had a note to myself to go back in and fix anything up on that front after merging with devel, as needed, but happy for you to do it now.

And happy for you to make the other suggested changes directly.

perrydv commented 9 months ago

@paciorek A couple of questions: It looks like R's integrate also returns the number of subdivisions use in the integration. It looks like we have access to that internally and could include it to make the return vector be length 4. Did you not want to do that or should we do that to be more complete? Also, the R-style lazy evaluation defaults of abs.tol = rel.tol can't work in compiled code, so I will make rel.tol default to the same as abs.tol's default. That does mean that changing rel.tol will not automatically change abs.tol by default.

paciorek commented 9 months ago

I believe I left out the number of subdivisions because it is controlled by the user so it didn't seem necessary to tell them what they put in.

As far as lazy eval, I was trying to mimic what we do with the args of the gamma distribution (scale and rate), and really thought I had done it correctly in the keyword processing. Can you say more about what the integrate keyword processing is doing wrong? That all happens at compile time, so aren't we free to be quite flexible about how we transform the code that a user provides?

perrydv commented 9 months ago

@paciorek Take a look at what I did here. A remaining use case that we could try to support would be using nimIntegrate directly in a model. But I suspect we are happy to defer that because it is easy enough to write a nimbleFunction to wrap a call to nimIntegrate, as you demonstrated in test-integrate. See test-integrate to walk through the changes in functionality that I made.

perrydv commented 9 months ago

I'm now catching comments in the thread that I didn't see before I pushed code.

BTW having param be an expression should now work and is shown in testing.

paciorek commented 9 months ago

Testing hung on test-ADlaplace, presumably unrelated to this PR. I'm merging and we'll deal with any actual issues on devel.