nmfs-ost / ss3-source-code

The source code for Stock Synthesis (SS3).
https://nmfs-ost.github.io/ss3-website/
Creative Commons Zero v1.0 Universal
36 stars 16 forks source link

[Feature]: Add full likelihood components for likelihoods, random effects and priors/penalties #553

Open Cole-Monnahan-NOAA opened 7 months ago

Cole-Monnahan-NOAA commented 7 months ago

Describe the solution you would like.

I suggest that SS3 should modify all objective function components (i.e., data likelihoods, priors, and hierarchical penalties) to use the full log-densities statements. E.g. dnorm and dmultinom type statements should be used in place of norm2 which drops the constants (example here).

Describe alternatives you have considered

One could meticulously add constants back but that is not ideal and would make the code harder to read and be more bug prone.

Statistical validity, if applicable

The approaches are equivalent up to a constant in the penalized maximum likelihood approach. However, this change would have three benefits as I see it.

  1. If a user wants to estimate a process error (sigmaR, SD for time-varying quantities, etc.) then the SD in those statements are no longer constant and the log-density calculations are incorrect and the estimate will be biased. Converting to the R-style dnorm statements will allow for improved options for estimating process errors in e.g. a Bayesian context.
  2. Improved interpretation of the NLL value when using more than one data likelihood. If a user tries the same model with multinomial and the D-M and the constants are not added then it's not clear what the change in NLL means. The constants must be included to compare among likelihoods (e.g., Burnham and Anderson 2014).
  3. The other advantage is code readability and transparency. Most users will be more familiar with dnorm(0,x,SD) and how to interpret that, and it is more clear what is a data likelihood vs prior.

Describe if this is needed for a management application

No response

Additional context

I realize this would require a big change and break all the tests (presuming they check for changes in the NLL). But the inference should be identical for all penalized ML models.

Also see request by quang-huynh for proper lognormal prior #558. Which should be done as an added prior option and the documentation will need full explanation of the difference.

Rick-Methot-NOAA commented 7 months ago

This indeed would be a large undertaking. SS3 has been penalized likelihood from day 1. Jim Thorson also has asked for this change.

kellijohnson-NOAA commented 7 months ago

@Rick-Methot-NOAA with your permission, I would like to attempt to work on this once I have the hake stock assessment finished (mid-February 2024). I am sure that I will have lots of questions and need help but at least you wouldn't be tasked with doing the whole thing.

Rick-Methot-NOAA commented 7 months ago

Glad you asked because I almost assigned it to you as a co-lead myself. So, you are welcome to be fully involved. First step can be assembling a list of current code features that will be affected.

Cole-Monnahan-NOAA commented 7 months ago

I don't think any features will be affected unless I'm missing something? I'm happy to help where I can. It seems prudent to list the changes needed and then walk through them one at a time using the test suite to ensure no changes to parameter estimates.

Rick-Methot-NOAA commented 7 months ago

features affected: multinomial logL includes offset for logL of a perfect fit so final value approaches 0 from below some logL components have a -log(s) that can be invoked by the sd_offset setting etc.

Cole-Monnahan-NOAA commented 7 months ago

Great points Rick. Many of these are features of the objfun I don't understand so that'll be good to have some others put eyes on it and understand it better.

@kellijohnson-NOAA I'll wait for your lead on how to proceed. I consider this a priority but non-urgent issue, personally.