Open bbolker opened 2 years ago
Comparison operators of AD types are handled by operator overloading, so pretty much anything you can imagine is technically possible. However, the challenge is to choose a good behavior, i.e. one that's helpful without being annoying.
The simplest option is to handle AD comparison by triggering a warning or error. This could be done at either compile time or run time, and a control switch could be added to disable the check. Unfortunately, this option turns out to be too strict because many 'black-box' algorithms from Eigen or elsewhere, which we know should be branching free (e.g. plain Cholesky factorization) are actually using comparisons anyway (for whatever reason). The TMB user would have no other choice than to switch off the check. Another concern with the simple option (compile time version) is that sometimes, e.g. during the sdreport phase of ADREPORTed quantities, it's perfectly OK to use comparisons because the AD tape is only needed for one set of parameter values.
A more sophisticated approach is to add special 'checking tags' to the AD tape to check that the same branches are taken on every function evaluation. If a check fails one can trigger a complete retaping. The obvious downside of this idea is the potentially huge performance hit caused by a full retaping. Ideally it should made easy for the TMB user to only retape the necessary subset affected by the branching. This is what I think is worth aiming for.
Thanks for the detailed explanation; I can see that this falls in the 'uncanny valley' between "easy, let's just do it" and "undesirable or technically impossible" ...
The fact that
is clearly documented here (and the reason is clear at least in hindsight if one thinks about differentiability).
Would it be technically feasible to issue a warning in this case?
(As a side note, is it worth mentioning
CppAD::CondExpLt
and friends for those cases where it actually makes sense to branch [e.g. to switch between two different representations which work better in different parameter regimes, but are continuous/have continuous derivatives at the switch point] ?)