kaskr / adcomp

AD computation with Template Model Builder (TMB)

R-session crash during MakeADFun using TMBad but not using CppAD #365

Closed James-Thorson-NOAA closed 2 years ago

James-Thorson-NOAA commented 2 years ago

Description:

I am trying to switch VAST to framework="TMBad" (to use the faster epsilon bias-correction options), but one of my integrated tests crashes my R session when calling MakeADFun with framework="TMBad". With framework="CppAD", the same call succeeds and gives the same answer as before.

Reproducible Steps:

Please see attached code: Replicate_problem--CLEAN.zip

Current Output:

Using TMBad it crashes R session with no output; using CppAD it creates the TMB object as expected.

Expected Output:

standard output from MakeADFun

TMB Version:

‘1.9.0’

R Version:

"R version 4.2.1 (2022-06-23 ucrt)"

Operating System:

Windows 10 Enterprise

PS

I realize that this is a high-level "integrated" test of a problem. Please tell me if you want me to comment out code, e.g., to identify what specific CPP code is associated with the problem.

kaskr commented 2 years ago

~It would be a great help if you could boil it down as much as possible.~ Example is fine as is.

kaskr commented 2 years ago

You've encountered a new feature of TMBad: the ability to detect undefined behaviour in your template.

Explanation

The following code:

Type a;
Type x = a*a;

is an example of undefined behaviour because a is not initialized. If you run it anyway, a will hold random junk that often happens to be zero, but not always! Fortunately, the compiler will warn you about it.

However, if you change the code to this:

vector<Type> a(1);
Type x = a[0]*a[0];

the compiler won't give you a warning (at least not on my system).

When compiled with CppAD you can run the code without noticing the bug. When compiled with TMBad you will get a crash.
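To make the idea concrete, here is a toy sketch of what "detecting a never-assigned value at the point of use" can look like. It is not how TMBad implements its check: Tracked and square_first_fails are hypothetical names invented for this illustration, and the flag is set deterministically by the default constructor, so nothing here relies on undefined behaviour.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical toy type (not part of TMB or TMBad): a double that
// remembers whether it was ever given a value.
class Tracked {
public:
    Tracked() : value_(0.0), set_(false) {}        // default: "never assigned"
    Tracked(double v) : value_(v), set_(true) {}   // assigned from a number

    // Reading a never-assigned value is reported instead of silently used.
    double get() const {
        if (!set_) throw std::logic_error("Variable not initialized?");
        return value_;
    }

private:
    double value_;
    bool set_;
};

// Multiplication reads both operands, so it triggers the check.
inline Tracked operator*(const Tracked& a, const Tracked& b) {
    return Tracked(a.get() * b.get());
}

// Returns true if computing a[0]*a[0] hit the "not initialized" check.
bool square_first_fails(const std::vector<Tracked>& a) {
    assert(!a.empty());
    try {
        Tracked x = a[0] * a[0];
        (void)x;  // result unused; only the check matters here
        return false;
    } catch (const std::logic_error&) {
        return true;
    }
}
```

With std::vector<Tracked> a(1), square_first_fails(a) reports the problem; after a[0] = 3.0 it does not. A plain double in the same position would simply be multiplied, which is the situation where the bug goes unnoticed.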

Running VAST example

When running your script I get this message (didn't you see it?):

TMBad assertion failed.
The following condition was not met: in_context_stack(data.glob)
Possible reason: Variable not initialized?
For more info run your program through a debugger.

I get the problematic line right away when running through the debugger:

#17 0x00007fffe49e441d in objective_function<TMBad::global::ad_aug>::operator() (this=0x7fffffff57b0) at VAST_v14_0_1_TMBad.cpp:2772

which is

D_i = R1_i * R2_i;   // Used in DHARMa residual plotting

Some element of either R1_i or R2_i was not calculated, causing the uninitialized value to be used.

Perhaps you meant to zero initialize:

vector<Type> R1_i(n_i); R1_i.setZero();
vector<Type> R2_i(n_i); R2_i.setZero();

On the other hand, it might be important to figure out why an R_i calculation was skipped.

James-Thorson-NOAA commented 2 years ago

Very cool! I can see that this is a helpful automated check.

I had tracked it to the D_i = R1_i * R2_i; by brute force (commenting out partitions of code and checking which would compile vs. crash), but I'm glad you identified what's happening.

I don't get the terminal message when running either compile (which works as normal) or MakeADFun (which crashes the R terminal before I can see any messages). I also have trouble with the debugger on my Windows machine.

In case you're interested, I only compute R1_i and R2_i when the response !isNA(b_i(i)) (and only use the output in that same case), so that was causing the uninitialized values. But I see the point of avoiding passing uninitialized values, which users might find and misinterpret. So I am adding R1_i(i) = NAN; R2_i(i) = NAN; whenever isNA(b_i(i)).
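A minimal sketch of that fix pattern in plain C++, outside of TMB. The assumptions: double and std::vector<double> stand in for Type and vector<Type>, std::isnan stands in for isNA, and compute_D, r1_obs and r2_obs are hypothetical names for whatever the template actually computes per observation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for the per-observation product D_i = R1_i * R2_i.
// Note: std::vector<double>(n) zero-fills its storage, whereas TMB's
// Eigen-based vector<Type>(n) leaves it uninitialized, so the explicit
// else-branch below matters far more in a real template than it does here.
std::vector<double> compute_D(const std::vector<double>& b_i,
                              const std::vector<double>& r1_obs,
                              const std::vector<double>& r2_obs) {
    assert(b_i.size() == r1_obs.size() && b_i.size() == r2_obs.size());
    const std::size_t n_i = b_i.size();
    const double na = std::numeric_limits<double>::quiet_NaN();

    std::vector<double> R1_i(n_i), R2_i(n_i), D_i(n_i);
    for (std::size_t i = 0; i < n_i; ++i) {
        if (!std::isnan(b_i[i])) {
            // Observation present: compute both components.
            R1_i[i] = r1_obs[i];
            R2_i[i] = r2_obs[i];
        } else {
            // Observation missing: assign an explicit NaN so that every
            // element has a defined value before it is read.
            R1_i[i] = na;
            R2_i[i] = na;
        }
    }
    for (std::size_t i = 0; i < n_i; ++i) {
        D_i[i] = R1_i[i] * R2_i[i];  // NaN propagates for missing rows
    }
    return D_i;
}
```

Calling compute_D with a missing second response yields a NaN in the second element of the result and ordinary products elsewhere, so downstream code sees an explicit "missing" marker rather than whatever happened to be in memory.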

Perhaps it's worth trying to change TMB to avoid crashing the R-terminal on Windows (or whatever context of my machine is causing it to happen), but I'm happy with this fix for my instance and closing the issue.