NOAA-FIMS / FIMS

The repository for development of FIMS
https://noaa-fims.github.io/FIMS/
GNU General Public License v3.0
13 stars 8 forks source link

Investigate constrain deviations in recruitment #364

Open msupernaw opened 1 year ago

msupernaw commented 1 year ago

Describe the bug

When constrain deviations is set to true, the resulting gradient values are NaN.

To Reproduce

set recruitment deviations to true in recruitment_base.

Expected behavior

normal convergence

Screenshots

No response

Which OS are you seeing the problem on?

Mac

Which browser are you seeing the problem on?

No response

Which version of FIMS are you seeing the problem on?

all

Additional Context

No response

ChristineStawitz-NOAA commented 1 year ago

Thanks for flagging! Is constrained deviations the same as the sum to zero constraint? If so, we had a discussion about this at a meeting that I think you missed @msupernaw and concluded sum-to-zero is a bad practice that we should not support.

But we should keep the issue open and I will change it to remove the option if I have that right.

msupernaw commented 1 year ago

Ok, thanks for reminding me. I seem to remember something about that. Still, it should work though. I'm curious about what's causing the NaN's.

On Thu, May 25, 2023 at 12:24 PM Christine Stawitz - NOAA < @.***> wrote:

Thanks for flagging! Is constrained deviations the same as the sum to zero constraint? If so, we had a discussion about this at a meeting that I think you missed @msupernaw https://github.com/msupernaw and concluded sum-to-zero is a bad practice that we should not support.

But we should keep the issue open and I will change it to remove the option if I have that right.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-FIMS/FIMS/issues/364#issuecomment-1563185579, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFUSEDHLJ7LFAGUQINU6QTXH6BTXANCNFSM6AAAAAAYPBZSFY . You are receiving this because you were mentioned.Message ID: @.***>

-- Matthew Supernaw Scientific Software Developer National Oceanic and Atmospheric Administration Office Of Science and Technology NOAA Fisheries | U.S. Department of Commerce Phone 248 - 396 - 7797

ChristineStawitz-NOAA commented 1 year ago

@JonBrodziak comment:

Christine mentions some discussion among FIMS persons where constraints on recruitment deviations were concluded to be bad practice. I did not take part in this discussion.

What was the logic for this conclusion? It is completely wrong from a geometrical view. The constraint that a set of parameters sum to any number is a reduction in dimensionality from an n-D parameter subspace to an (n-1)-D parameter subspace. This reduction implies that one can calculate a mean value over the set of parameters in the subspace and represent each parameter in the subspace as a deviation from the mean. And one can always do this calculation if the mean of the parameters is a finite number, which it will always be with an assessment model.

The assertion that optimizing model parameters with a parameter constraint on recruitment deviations is a bad modeling practice is incorrect at a basic level of geometry. The constraint reduces the ambiguity of scale for the parameter subspace which is important for assessment models because one can always explain observed data with an unrealistically large population size that is observed with high variability.

Parameter constraints are important for estimability in high-dimensional models with limited observational data.

ChristineStawitz-NOAA commented 1 year ago

Bad practice is a poor choice of language on my part. This was discussed on 4/26 in the implementation team meeting and you can find the notes here. I pasted the relevant bit below.

Re:sum to zero constraint: @Cole-Monnahan-NOAA is against This was an unfinished ADMB issue that no one else has been able to resolve so its not clear we need it If we do use it we need to understand the issues posed for ADMB param_init_bounded_dev_vector has incorrect values when used with -mcmc option. · Issue #107 · admb-project/admb · GitHub @Rick-Methot-NOAA is exploring this in SS3. The original motivation was to make sure that the mean of the recdevs is consistent with the R0 parameter, but if you have 40 years of sigmaR penalizing differences from the mean so things will be close to summing to zero without this constraint. Good to have an answer for folks who are concerned about transitioning from models that do have this constraint. Consensus among us that we should not be adding it to FIMS at this time

Rick-Methot-NOAA commented 1 year ago

I am confident that the answer depends upon how far back in time the devs start before there is substantial data. A quick first test would be to re-run Bai's test data set using SS3 w and w/o the constraint. It's a simple choice of 1 or 2.

Andrea-Havron-NOAA commented 1 year ago

@msupernaw, I took a look and realized we're doing the sum to zero constraint on the recruit deviations before they're being logged (ie., when all values are all positive). This results in negative values being passed to the log function here, which is probably what is causing the NAs.

If we were to keep this option, the fix that changes the least amount of code would be to calculate the sum in log space and then back transform: this->recruit_deviations[i] -= exp(logsum / (this->recruit_deviations.size()));

JonBrodziak commented 1 year ago

One direct approach would be to include a Lagrange multiplier for the parameter constraint. I thought ADMB had adequately accounted for this with the definition of the object "init_bounded_dev_vector" but maybe not.

On Thu, May 25, 2023 at 8:41 AM Christine Stawitz - NOAA < @.***> wrote:

Bad practice is a poor choice of language on my part. This was discussed on 4/26 in the implementation team meeting and you can find the notes here https://docs.google.com/document/d/10nSfbPaBF2p7wL2cr5lW7PGZxlZI4tbPiwQRC8JKaXk/edit. I pasted the relevant bit below.

Re:sum to zero constraint: @Cole-Monnahan-NOAA https://github.com/Cole-Monnahan-NOAA is against This was an unfinished ADMB issue that no one else has been able to resolve so its not clear we need it If we do use it we need to understand the issues posed for ADMB param_init_bounded_dev_vector has incorrect values when used with -mcmc option. · Issue #107 · admb-project/admb · GitHub https://github.com/admb-project/admb/issues/107 @Rick-Methot-NOAA https://github.com/Rick-Methot-NOAA is exploring this in SS3. The original motivation was to make sure that the mean of the recdevs is consistent with the R0 parameter, but if you have 40 years of sigmaR penalizing differences from the mean so things will be close to summing to zero without this constraint. Good to have an answer for folks who are concerned about transitioning from models that do have this constraint. Consensus among us that we should not be adding it to FIMS at this time

— Reply to this email directly, view it on GitHub https://github.com/NOAA-FIMS/FIMS/issues/364#issuecomment-1563329399, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACVXIZUKLTBINXGM7TKSJ2LXH6RT7ANCNFSM6AAAAAAYPBZSFY . You are receiving this because you were mentioned.Message ID: @.***>

-- Jon Brodziak, Ph.D. NOAA Inouye Regional Center Pacific Islands Fisheries Science Center 1845 Wasp Boulevard, Building 176, NMFS/PIFSC/FRMD Mail Room 2247 Honolulu, Hawaii 96818 USA Phone: 808-725-5617 Email: @.***

“Wherever my travels may lead, paradise is where I am.” ~ Voltaire

The views expressed in this message are my own and do not necessarily reflect any position of NOAA.

Bai-Li-NOAA commented 1 year ago

A quick first test would be to re-run Bai's test data set using SS3 w and w/o the constraint. It's a simple choice of 1 or 2.

Happy to re-run model comparison project and SS3 to test it.

Andrea-Havron-NOAA commented 1 year ago

We're about to do an in depth code review with the full implementation team. We can fix the bug in the constrained deviation calculation in the recruitment module during this time.

Cole-Monnahan-NOAA commented 1 year ago

FYI there are some existing issues/discussions about bounded_dev_vectors

https://github.com/nmfs-stock-synthesis/stock-synthesis/issues/29 https://github.com/nmfs-stock-synthesis/stock-synthesis/issues/37 https://github.com/NOAA-FIMS/FIMS/issues/173

My only two comments are (1) if we're going to use it we need to test/explore it more. This option is not used in many other software platforms (e.g., Stan) for random effect vectors, so I think it's our responsibility to do the due diligence. (2) It needs to work for optimization, Laplace approximation, MCMC, and simulation. It seems to work fine for optimization in ADMB but not in other contexts.

Rick-Methot-NOAA commented 1 year ago

The problem in ADMB is that it was inconsistently implemented for MCMC and MCEVAL. Consequently a draw that was accepted by MCMC could produce a weird result when displayed by MCEVAL

Andrea-Havron-NOAA commented 1 year ago

The underlying statistical issue at play here is identifiability. The sum-to-zero constraint is a frequentist approach to identifaibility (referred to as a hard constraint). It is considered a transformation that results in a changed interpretation of the parameter vector to be that of deviations around a mean, as @JonBrodziak noted above. This constraint requires reducing the parameter vector by 1 to get n-1 degrees of freedom, which is currently not implemented in FIMS ConstrainDeviatoins method.

Penalties, priors, and random effects are a hierarchical approach to identifiability (referred to as a soft constraint). Given that the recdevs in FIMS have a penalty, i.e. a distribution with a mean of 0, we are applying a constraint and can still interpret the parameter vector as a set of deviations around a mean.

I think the practice being questioned here is that of using both hard and soft constraints together to ensure identifiability in our models. If we look into this issue further, it would pair well with an exploration of methods to check model identifiability (e.g. data cloning).

Cole-Monnahan-NOAA commented 1 year ago

I don't think you can call it a hard constraint b/c it uses a penalty internally, and you can see this by looking at the mean of these vectors which will be very small but not identically 0. So it's more like a hard-ish constaint. It also is only available as a bounded parameter vector (there is no dev_vector type as far as I know) which can cause unexpected behavior when some devs are near the bounds. They'll exceed the bounds sometimes. I don't really understand how the penalty works when it's not included in the model NLL. It would be good for an interested party to try and recreate the constraint in the .tpl or .cpp so we can play with it a bit easier. Someone who actually wants this feature in FIMS should do the due diligence IMO, the default is to not include it right?

Andrea-Havron-NOAA commented 1 year ago

The terms, hard and soft, are how I've seen the methods referred to in the literature, but when used in isolation. I agree, once you introduce the penalty, the method can no longer be considered a hard constraint.

I think the best plan moving forward is to fix the bug in the ConstrainDeviations method I mentioned above and then remove the flag from the interface so the method can't be accessed when setting up a FIMS model.

Cole-Monnahan-NOAA commented 1 year ago

Agreed. I think having a placeholder for it for later purposes is a good idea and focus on standard approaches at the moment.

Andrea-Havron-NOAA commented 1 year ago

It looks like the constrain deviations boolean is not exposed to R under rcpp_interface.hpp. Should the title of this issue be updated to: Investigate constrain deviations in recruitment to preserve the conversation?