specification of likelihoods for composition data

iantaylor-NOAA commented 2 years ago

During 16 Feb 2022 meeting, @Cole-Monnahan-NOAA noted a difference in assumptions about additive constants added to composition data in multinomial likelihood to make it robust to the presence of zero observations in some bins (constant could be added to both expected and observed or just observed). @jimianelli-NOAA, suggested that neither approach is best.

In the hopes that a github issue like this is a reasonable place to come up with more detail on the specification than what's currently documented on this line: https://github.com/NOAA-FIMS/collaborative_workflow/blob/e939504add68ae62b1e60dede14b9187c2225885/035-model-specification.Rmd#L100

@jimianelli-NOAA please let us know what you think is the better approach. @Bai-Li-NOAA, I'm assuming this issue isn't present in the model comparison OM since the likelihoods are all on the EM side of that project, right?

Regardless of the approach chosen as best practice, would it be useful to include the option of alternative approaches to facilitate comparison with existing models?

Can we put Dirichlet-multinomial likelihood on the wishlist for Milestone 2? I see we have "Weighting and auto-weighting" on our requirements list: https://docs.google.com/spreadsheets/d/1_QRhYpzhRzzE4mon-934b8En0Xxz3JJfNZsd7UTZnxQ/edit#gid=0&range=C13 but I'm not sure how we plan to keep track of potential specifications that are not in Milestone 1.

Bai-Li-NOAA commented 2 years ago

@iantaylor-NOAA, you are correct. The model comparison project OM does not add constants to simulated composition data, that happens on the EM side. Based on my code snippets comparison back in 2019, I can confirm that the versions of BAM and AMAK I used added small numbers to both predicted and observed values.

@msupernaw probably knows more than me, maybe it is possible to add an option to the EM, so the EM will add a constant value if TRUE. For example, r4MAS would add sum to 0 constraint to recruitment deviations if users set constrained deviations = TRUE.

Appendix B of the software design specification - Milestone 1 might be a place to track the wishlist.

jimianelli-NOAA commented 2 years ago

Thanks for raising this from our discussion today. I've surely forgotten the rationale/arguments on the point of adding a constant to a multinomial.

The reason I said having no constant and one that's super small (1.e-15) may be poor choices (they are nearly the same thing) is that:

Multinomial likelihoods tend to lack robustness when sampling assumptions are violated
Fournier et al. (1990) (nicely summarised in Fisch et al. (2021 https://doi.org/10.1016/j.fishres.2021.106069) promoted a robust approach to composition data when deviations from assumptions occur (as an aside, the Fisch paper I think didn't really tested (via simulation) approaches in which distributional assumptions were violated, so if true, then not a fair test of likelihood approaches)
Adding a small constant to the proportions (e.g., 1.0e-3 or 1.0e-4 ) is intended to add robustness (think of it as avoiding paying too much attention to fitting exceedingly small proportions).

WRT FIMs, I think having options available to assessment scientists to evaluate is what matters most (as opposed to restricting that flexibility).

Cheers, Jim

On Wed, Feb 16, 2022 at 2:01 PM Ian Taylor @.***> wrote:

During 16 Feb 2022 meeting, @Cole-Monnahan-NOAA https://github.com/Cole-Monnahan-NOAA noted a difference in assumptions about additive constants added to composition data in multinomial likelihood to make it robust to the presence of zero observations in some bins (constant could be added to both expected and observed or just observed). @jimianelli-NOAA https://github.com/jimianelli-NOAA, suggested that neither approach is best.

In the hopes that a github issue like this is a reasonable place to come up with more detail on the specification than what's currently documented on this line:

https://github.com/NOAA-FIMS/collaborative_workflow/blob/e939504add68ae62b1e60dede14b9187c2225885/035-model-specification.Rmd#L100

@jimianelli-NOAA https://github.com/jimianelli-NOAA please let us know what you think is the better approach. @Bai-Li-NOAA https://github.com/Bai-Li-NOAA, I'm assuming this issue isn't present in the model comparison OM since the likelihoods are all on the EM side of that project, right?

Regardless of the approach chosen as best practice, would it be useful to include the option of alternative approaches to facilitate comparison with existing models?

Can we put Dirichlet-multinomial likelihood on the wishlist for Milestone 2? I see we have "Weighting and auto-weighting" on our requirements list: https://docs.google.com/spreadsheets/d/1_QRhYpzhRzzE4mon-934b8En0Xxz3JJfNZsd7UTZnxQ/edit#gid=0&range=C13 but I'm not sure how we plan to keep track of potential specifications that are not in Milestone 1.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-FIMS/collaborative_workflow/issues/49, or unsubscribe https://github.com/notifications/unsubscribe-auth/AM2OFRJ6Q4CHRALK3RU46QLU3QNDPANCNFSM5OS4NQOQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

--

James Ianelli NMFS/NOAA Building 4 7600 Sand Pt Way NE Seattle WA 98115

206 526 6510

Visit the ADMB project http://admb-project.org/

Also see Alaska groundfish stock assessments https://www.fisheries.noaa.gov/alaska/population-assessments/north-pacific-groundfish-stock-assessment-and-fishery-evaluation

ChristineStawitz-NOAA commented 2 years ago

I'm tracking Dirichlet-multinomial on the wishlist in the FIMS repo; seems like we won't know until we try running Bai's OM code w FIMS as the EM whether we need a robust option to get convergence for M1 or if we can push it off to M2.

NOAA-FIMS / collaborative_workflow

specification of likelihoods for composition data #49