Cole-Monnahan-NOAA opened this issue 10 months ago
This indeed would be a large undertaking. SS3 has used penalized likelihood from day 1. Jim Thorson has also asked for this change.
@Rick-Methot-NOAA with your permission, I would like to attempt to work on this once I have the hake stock assessment finished (mid-February 2024). I am sure that I will have lots of questions and will need help, but at least you wouldn't be tasked with doing the whole thing.
Glad you asked, because I almost assigned it to you as a co-lead myself. So you are welcome to be fully involved. A first step can be assembling a list of current code features that will be affected.
I don't think any features will be affected, unless I'm missing something. I'm happy to help where I can. It seems prudent to list the changes needed and then walk through them one at a time, using the test suite to ensure there are no changes to parameter estimates.
Features affected:
- The multinomial logL includes an offset for the logL of a perfect fit, so the final value approaches 0 from below (see the sketch below).
- Some logL components have a -log(s) term that can be invoked by the sd_offset setting.
- etc.
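To make the multinomial point concrete, here is a minimal standalone sketch in plain C++ (not SS3/ADMB source; the function names are hypothetical). It assumes the offset form sum(x_i * log(p_i / q_i)) with q the observed proportions, which is one common way to write the perfect-fit offset described above, and contrasts it with the full multinomial log-density. The two differ by the same constant for any predicted proportions, so only the reported logL shifts, not the fit.

```cpp
// Standalone illustration only (not SS3 source). Compares the full multinomial
// log-density with an offset-style logL that equals 0 at a perfect fit.
#include <cmath>
#include <cstdio>
#include <vector>

// Full multinomial log-density:
//   lgamma(n+1) - sum lgamma(x_i+1) + sum x_i*log(p_i)
double dmultinom_log(const std::vector<double>& x, const std::vector<double>& p) {
  double n = 0.0, ll = 0.0;
  for (double xi : x) n += xi;
  ll += std::lgamma(n + 1.0);
  for (std::size_t i = 0; i < x.size(); ++i)
    ll += -std::lgamma(x[i] + 1.0) + x[i] * std::log(p[i]);
  return ll;
}

// Offset-style logL: sum x_i*log(p_i/q_i), with q_i = x_i/n the observed
// proportions; equals 0 at a perfect fit (p = q) and is negative otherwise.
double multinom_offset_log(const std::vector<double>& x, const std::vector<double>& p) {
  double n = 0.0, ll = 0.0;
  for (double xi : x) n += xi;
  for (std::size_t i = 0; i < x.size(); ++i)
    ll += x[i] * std::log(p[i] / (x[i] / n));
  return ll;
}

int main() {
  std::vector<double> x  = {10, 25, 15};        // observed counts (made up)
  std::vector<double> p1 = {0.20, 0.50, 0.30};  // one candidate fit
  std::vector<double> p2 = {0.25, 0.45, 0.30};  // another candidate fit
  // The difference between the two versions is the same constant for any
  // predicted proportions, so parameter estimates are unaffected.
  std::printf("diff at p1: %f\n", dmultinom_log(x, p1) - multinom_offset_log(x, p1));
  std::printf("diff at p2: %f\n", dmultinom_log(x, p2) - multinom_offset_log(x, p2));
  return 0;
}
```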
Great points, Rick. Many of these are features of the objfun that I don't understand, so it will be good to have some others put eyes on it and understand it better.
@kellijohnson-NOAA I'll wait for your lead on how to proceed. I consider this a priority but non-urgent issue, personally.
Describe the solution you would like.
I suggest that SS3 should modify all objective function components (i.e., data likelihoods, priors, and hierarchical penalties) to use full log-density statements. E.g., `dnorm` and `dmultinom` type statements should be used in place of `norm2`, which drops the constants (example here).
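As a minimal illustration of what "dropping the constants" means, the sketch below (plain C++, not SS3/ADMB source; the function names are hypothetical) compares a norm2-style squared-deviation kernel with the full normal negative log-density. With the SD fixed, the two differ by the same constant at every parameter value, so gradients and penalized-ML estimates are identical.

```cpp
// Standalone illustration only (not SS3/ADMB source).
#include <cmath>
#include <cstdio>

const double PI = 3.14159265358979323846;

// norm2-style kernel: 0.5 * ((x - mu)/sd)^2, i.e. the constants are dropped.
double nll_kernel(double x, double mu, double sd) {
  double z = (x - mu) / sd;
  return 0.5 * z * z;
}

// Full normal negative log-density:
//   0.5*log(2*pi) + log(sd) + 0.5*((x - mu)/sd)^2
double nll_dnorm(double x, double mu, double sd) {
  return 0.5 * std::log(2.0 * PI) + std::log(sd) + nll_kernel(x, mu, sd);
}

int main() {
  double x = 1.3, sd = 0.4;
  // With sd fixed, the difference is the same constant for every mu, so the
  // gradient and the penalized-ML estimate of mu are unchanged.
  for (double mu : {0.5, 1.0, 1.5})
    std::printf("mu=%.1f  diff=%f\n", mu, nll_dnorm(x, mu, sd) - nll_kernel(x, mu, sd));
  return 0;
}
```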
Describe alternatives you have considered

One could meticulously add the constants back, but that is not ideal: it would make the code harder to read and more bug-prone.
Statistical validity, if applicable
The approaches are equivalent up to a constant in the penalized maximum likelihood approach. However, this change would have three benefits as I see it:
1. Full `dnorm` statements will allow for improved options for estimating process errors in, e.g., a Bayesian context (see the sketch after this list).
2. It reduces confusion over penalty statements like `dnorm(0, x, SD)` and how to interpret that.
3. It is more clear what is a data likelihood vs. a prior.
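To illustrate the first benefit, here is a minimal sketch (plain C++, not SS3/ADMB source; the deviation values are made up) of why the full density matters once the SD itself is estimated, e.g., a process-error SD: with the kernel alone the objective keeps improving as the SD grows, while the full negative log-density has an interior minimum at sqrt(mean(dev^2)).

```cpp
// Standalone illustration only (not SS3 source): profile over sd for a set of
// normal deviations, with and without the log(sd) and 0.5*log(2*pi) terms.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  const double PI = 3.14159265358979323846;
  std::vector<double> dev = {0.12, -0.35, 0.40, -0.05, 0.22};  // made-up deviations
  double ss = 0.0;
  for (double d : dev) ss += d * d;
  const double n = static_cast<double>(dev.size());

  for (double sd : {0.1, 0.2, 0.3, 0.5, 1.0, 2.0}) {
    double kernel = ss / (2.0 * sd * sd);                        // constants dropped
    double full   = kernel + n * (std::log(sd) + 0.5 * std::log(2.0 * PI));
    // kernel-only decreases without bound as sd grows; the full NLL does not.
    std::printf("sd=%.1f  kernel-only=%8.4f  full NLL=%8.4f\n", sd, kernel, full);
  }
  std::printf("full-NLL minimizer: sd = %.4f\n", std::sqrt(ss / n));
  return 0;
}
```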
Describe if this is needed for a management application

No response
Additional context
I realize this would require a big change and break all the tests (presuming they check for changes in the NLL). But the inference should be identical for all penalized ML models.
Also see the request by quang-huynh for a proper lognormal prior (#558), which should be added as a new prior option; the documentation will need a full explanation of the difference.
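One plausible reading of "the difference" is the Jacobian term: a proper lognormal density on x differs from a normal density applied to log(x) by a log(x) term, which is not constant in x. A minimal sketch under that assumption (plain C++, hypothetical function names, not SS3 source):

```cpp
// Standalone illustration only (not SS3 source).
#include <cmath>
#include <cstdio>

const double PI = 3.14159265358979323846;

// Proper lognormal negative log-density on x:
//   log(x) + log(sd) + 0.5*log(2*pi) + 0.5*((log(x) - mu)/sd)^2
double nll_lognormal(double x, double mu, double sd) {
  double z = (std::log(x) - mu) / sd;
  return std::log(x) + std::log(sd) + 0.5 * std::log(2.0 * PI) + 0.5 * z * z;
}

// Normal negative log-density applied to log(x): same expression without the
// log(x) Jacobian term.
double nll_normal_on_log(double x, double mu, double sd) {
  double z = (std::log(x) - mu) / sd;
  return std::log(sd) + 0.5 * std::log(2.0 * PI) + 0.5 * z * z;
}

int main() {
  double mu = std::log(0.2), sd = 0.4;  // made-up prior parameters
  for (double x : {0.1, 0.2, 0.4})
    std::printf("x=%.1f  lognormal=%.4f  normal-on-log=%.4f  jacobian log(x)=%.4f\n",
                x, nll_lognormal(x, mu, sd), nll_normal_on_log(x, mu, sd),
                nll_lognormal(x, mu, sd) - nll_normal_on_log(x, mu, sd));
  return 0;
}
```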