[Feature]: add ability to toggle between ADMB-style sum to zero constraint and k-1 style

ChristineStawitz-NOAA commented 2 years ago

Is your feature request related to a problem? Please describe.

The ADMB sum-to-zero constraint uses a penalty, which is less efficient than defining the kth element of the vector as the negative sum of elements 1, ..., k-1, but it is needed to ensure comparability between FIMS and an ADMB-based model. In the future, it would be nice to be able to toggle between the two in FIMS.
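To make the two styles concrete, here is a minimal Python sketch of the toggle, not FIMS or ADMB code. The function names, the `style` strings, and the penalty weight are all hypothetical; in particular, the weight ADMB actually uses is not stated in this issue and would need to be confirmed from the ADMB source.

```python
def expand_k_minus_1(free):
    """k-1 style: k-1 free elements; the kth is the negative sum of the rest."""
    return list(free) + [-sum(free)]


def sum_to_zero_penalty(devs, weight=1.0e4):
    """Penalty style: quadratic penalty on the mean of the deviations.

    `weight` is a hypothetical value chosen for illustration only.
    """
    mean = sum(devs) / len(devs)
    return weight * mean * mean


def constrain(params, style):
    """Return (deviation vector, penalty to add to the objective)."""
    if style == "k_minus_1":
        # The constraint holds exactly by construction, so no penalty is needed.
        return expand_k_minus_1(params), 0.0
    if style == "penalty":
        # All k elements are estimated; the constraint is only encouraged.
        devs = list(params)
        return devs, sum_to_zero_penalty(devs)
    raise ValueError("unknown style: " + style)


devs_exact, pen_exact = constrain([0.3, -0.1, 0.4], "k_minus_1")
devs_soft, pen_soft = constrain([0.1, 0.1, 0.1, 0.1], "penalty")
```

With the k-1 style the optimizer sees one fewer parameter and the sum is zero up to floating-point error; with the penalty style the sum is only pushed toward zero, which is why the two can give slightly different estimates.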

Describe the solution you would like.

The ADMB-style constraint will be implemented in m1, so m2 would add the k-1 approach and the ability to switch between the two.

To implement the ADMB-style sum-to-zero constraint with a penalty, we first need to see how ADMB does it: compile all ADMB code in debug mode, put a breakpoint at the set_value function for the bounded dev vector, and, when execution stops there, inspect the call stack to see the code that calls that function and what happens to the penalty (thanks to Dave Fournier for the suggestions!).

Describe alternatives you have considered

The options for fixed effects are ADMB-style & k-1 style. In the future we will also have the ability to specify random effects with a distribution centered on 0. Other great suggestions from the ADMB post

Statistical validity, if applicable

The Stan documentation describes this approach and its caveats, notably that it can cause the kth element to be distributed differently from the others: https://mc-stan.org/docs/2_28/stan-users-guide/parameterizing-centered-vectors.html
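A small worked case of that caveat, under the illustrative assumption (not stated in this issue) that the k-1 free elements are given independent normal distributions with a common standard deviation $\sigma$:

```latex
x_k = -\sum_{i=1}^{k-1} x_i,
\qquad x_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)
\quad\Longrightarrow\quad
x_k \sim \mathcal{N}\bigl(0, (k-1)\,\sigma^2\bigr)
```

So the implied standard deviation of the kth element is $\sqrt{k-1}$ times that of the others, which is the asymmetry the k-1 style introduces.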

Describe if this is needed for a management application

This will be useful for comparing the two approaches and showing the results to Councils.

Additional context

No response

Craig44 commented 2 years ago

Edit: I was getting confused between sum-to-one and sum-to-zero, apologies.

Out of curiosity, I wonder if it is worth generalising this sum-to-zero constraint on the TMB side, like in ADMB, e.g. as a SUM_TO_ZERO_PARAMETER_VECTOR. I recognise you will need to implement the ADMB penalised version for backwards compatibility, but there are other parameter vectors that will want constraints, in particular vectors that are constrained to sum to one, which is a similar problem to the sum-to-zero constraint. Examples are recruitment allocation to the partition (i.e. proportion male and proportion female) and any movement parameters.

On a related note, I was wondering if you could add (or have planned to add) a general concept of parameter transformations, which may also be able to deal with the above issue. For example, we often estimate log(R0) instead of R0. Instead of hard-coding log R0 as a parameter, which I believe is done in SS (I could be wrong), allow users to choose a transformation for a parameter, e.g. estimate R0 logged, or estimate R0 with a bounded logistic transformation. This could flow through to vector parameters, where users could transform or constrain parameter vectors with, for example, the ADMB sum-to-zero constraint, a sum-to-one constraint such as the simplex transformation used in Stan, or perhaps another constraint.
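As a rough illustration of what user-selectable transformations could look like, here is a Python sketch. It is not an existing FIMS, TMB, or Stan API, and every name in it is hypothetical. The sum-to-one map uses a softmax with the last category as reference (an additive log-ratio style transform), which is a simpler stand-in for Stan's stick-breaking simplex transform, not a copy of it.

```python
import math


def inv_log(theta):
    """Estimate theta = log(x) on the real line; recover x > 0."""
    return math.exp(theta)


def inv_logit_bounded(theta, lower, upper):
    """Estimate theta on the real line; recover x strictly inside (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-theta))


def softmax_simplex(free):
    """Map k-1 unconstrained values to k positive proportions that sum to one.

    The last category is the reference, with its logit fixed at zero.
    """
    logits = list(free) + [0.0]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - shift) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical registry so a user can pick a scalar transformation by name.
SCALAR_TRANSFORMS = {
    "identity": lambda theta: theta,
    "log": inv_log,
}

r0 = SCALAR_TRANSFORMS["log"](math.log(1000.0))  # R0 estimated on the log scale
steepness = inv_logit_bounded(0.0, 0.2, 1.0)     # bounded scalar, midpoint of (0.2, 1.0)
prop_sex = softmax_simplex([0.0])                # proportion female / male
```

The optimizer only ever sees the unconstrained values; the model works with the back-transformed ones, so bounds and sum constraints hold without penalties.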

Andrea-Havron-NOAA commented 2 years ago

Thanks for the input Craig! I've used the simplex transformation in previous TMB models with good success. I think the goal is to provide a generalized framework for parameter transformation, although this feature might not be fully developed in Milestone 1.

Bai-Li-NOAA commented 2 years ago

Thanks @nathanvaughan-NOAA for providing an alternative formulation during a code review. To keep track of your suggestions, I moved them to this issue page.

Regarding the discussion of inherent limitations in the sum-to-zero dev vector question, I have a suggestion that may be helpful or way off base. Laying out the logic of the current approach:

A single R0 is estimated for the entire model so a dev vector is required to allow annual deviations from the SR curve.

Sum to zero is required to avoid a 1:1 correlation between the deviation mean and R0.

Lognormal rec devs are used to constrain the model to positive recruitments and, due to the zero-mean requirement, bias correction is needed; otherwise this bias remains correlated with R0.

Internal subtraction of the mean is undesirable as it produces messed-up gradient calculations by trading the correlation with R0 for correlation among the rec dev vector.

To attempt to correct for this, a likelihood penalty on rec dev mean != 0 can be added instead, but this does not ensure rec dev mean = 0, so a subtraction is still required intermittently (such as at the end of an optimization phase).

When subtracting the mean it is possible to force devs outside of their bounds causing optimization/gradient errors.

Alternative formulation:

Instead of fitting a global R0 and deviations we just estimate an independent R0 for every year (an initial equilibrium period could be considered its own year).

The global mean R0 is then a derived parameter from these independent annual R0 values.

No bias correction is required because there is no 1:1 correlation between parameters to remove.

No mean subtraction avoids gradient issues.

A sigma R parameter can still be used to restrain the annual variation around the derived mean R0.

A global prior could also be specified for R0 which would apply to the derived mean.

I think the current setup is fine for this milestone and aids comparison with existing assessment models but this is my 2 cents for future discussion.
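For future discussion, the alternative formulation above could be sketched roughly as follows in Python. This is one possible reading, not FIMS code: the function names are hypothetical, the annual R0 values are assumed to be estimated on the log scale, the derived mean is taken as the mean of the annual log R0 values, and sigma R enters as a normal negative log-likelihood around that mean.

```python
import math


def derived_mean_log_r0(log_r0_by_year):
    """Global mean log(R0), derived from the independent annual values."""
    return sum(log_r0_by_year) / len(log_r0_by_year)


def sigma_r_nll(log_r0_by_year, sigma_r):
    """Normal negative log-likelihood of annual log(R0) around the derived mean.

    This is the term that restrains annual variation; sigma_r is the sd.
    """
    mu = derived_mean_log_r0(log_r0_by_year)
    n = len(log_r0_by_year)
    sum_sq = sum((v - mu) ** 2 for v in log_r0_by_year)
    return (n * math.log(sigma_r)
            + 0.5 * n * math.log(2.0 * math.pi)
            + sum_sq / (2.0 * sigma_r ** 2))


# Made-up annual values, for illustration only.
log_r0 = [math.log(900.0), math.log(1000.0), math.log(1100.0)]
mu = derived_mean_log_r0(log_r0)
implied_devs = [v - mu for v in log_r0]
nll = sigma_r_nll(log_r0, 0.6)
```

The implied deviations from the derived mean sum to zero by construction, so no explicit mean subtraction or sum-to-zero penalty appears in the objective.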

NOAA-FIMS / FIMS