kaskr / adcomp

AD computation with Template Model Builder (TMB)
Other
178 stars 81 forks source link

possibility for a warning when `if` is used inappropriately in TMB code #362

Open bbolker opened 2 years ago

bbolker commented 2 years ago

The fact that

You should not use if(x) statements where x is a PARAMETER, or is derived from a variable of type PARAMETER. (It is OK to use if on DATA types, however.) TMB will remove the if(x) statement, so the code will produce unexpected results.

is clearly documented here (and the reason is clear at least in hindsight if one thinks about differentiability).

Would it be technically feasible to issue a warning in this case?

(As a side note, is it worth mentioning CppAD::CondExpLt and friends for those cases where it actually makes sense to branch [e.g. to switch between two different representations which work better in different parameter regimes, but are continuous/have continuous derivatives at the switch point] ?)

kaskr commented 2 years ago

Comparison operators of AD types are handled by operator overloading, so pretty much anything you can imagine is technically possible. However, the challenge is to choose a good behavior, i.e. one that's helpful without being annoying.

The simplest option is to handle AD comparison by triggering a warning or error. This could be done at either compile time or run time, and a control switch could be added to disable the check. Unfortunately, this option turns out to be too strict because many 'black-box' algorithms from Eigen or elsewhere, which we know should be branching free (e.g. plain Cholesky factorization) are actually using comparisons anyway (for whatever reason). The TMB user would have no other choice than to switch off the check. Another concern with the simple option (compile time version) is that sometimes, e.g. during the sdreport phase of ADREPORTed quantities, it's perfectly OK to use comparisons because the AD tape is only needed for one set of parameter values.

A more sophisticated approach is to add special 'checking tags' to the AD tape to check that the same branches are taken on every function evaluation. If a check fails one can trigger a complete retaping. The obvious downside of this idea is the potentially huge performance hit caused by a full retaping. Ideally it should made easy for the TMB user to only retape the necessary subset affected by the branching. This is what I think is worth aiming for.

bbolker commented 2 years ago

Thanks for the detailed explanation; I can see that this falls in the 'uncanny valley' between "easy, let's just do it" and "undesirable or technically impossible" ...