kaskr / RTMB

R bindings to TMB
Other
48 stars 6 forks source link

base-R function coverage, error messages, etc. #33

Open bbolker opened 2 months ago

bbolker commented 2 months ago

(Apologies if this issue has too much in it, or if you'd rather discuss this on the Google Groups forum.)

In setting up some examples for an intro to RTMB, I kept getting this error:

Error: 'y' is not a valid 'advector' (constructed using illegal operation?)

It took me a while to track down that this was caused by a call to lbeta(), defined in R as .Internal(lbeta(a, b)) (so it makes sense that trying to use this function would break stuff if it weren't explicitly defined by (R)TMB — the AD machinery can't follow the computation into the C code ...)

Is there a way to diagnose this kind of problem/make the error message more informative?

More generally, I've been thinking about how to support a wider range of functions implemented at the TMB level (e.g. there are a bunch of distributions implemented in glmmTMB that would be nice to have (I ran into this problem when implementing a beta-binomial), as well Jan-Ole Koslik's request for a von Mises distribution implementation (although in many cases these can be implemented in R, just with a small loss in efficiency). Is there a way to include extra compiled function definitions (either user-specified, or in an additional TMB_extras package that you or someone could maintain without having to mess with TMB constantly) ?

(I think the other main benefit of being able to define functions at the TMB/C++ level is to introduce conditionals when they're sensible, e.g. in limiting cases that would otherwise break.)

kaskr commented 2 months ago

Is there a way to diagnose this kind of problem/make the error message more informative?

The advector class is currently defined by structure(complex(), class="advector"). An unfortunate side effect is that R thinks this is a normal complex number and then happily evaluates:

x <- structure(complex(), class="advector_test")
lbeta(x,x)

There's no simple fix AFAICT because on the C-side R checks isComplex(x) which is TRUE and there's no way to change that.

One workaround could be to wrap the current representation in a list:

x <- structure(list(complex()), class="advector_test2")
lbeta(x,x)

which gives the error 'non-numeric argument to mathematical function' at the right place. Or alternatively as S4 setClass("advector_test3", slots=list(x="complex")) with essentially the same effect. I think these (or other?) alternative representations might be worth looking into. But I'm worried about the overhead caused by the extra wrapping, especially in cases where we are taping a large number of scalar operations.

Is there a way to include extra compiled function definitions ...?

It's possible. Proof of concept here as a package linking to RTMB. However, I really think that RTMB should add a log.p argument to pnorm making the logit_pnorm atomic redundant.

And yes, a good way to handle conditionals is still missing in RTMB. It would be nice if e.g. logspace_gamma could be implemented in RTMB without writing C++ code.