[Test]: Write tests for FIMS features not covered in existing tests

ChristineStawitz-NOAA commented 11 months ago

I went through the requirements and realized we have some features of FIMS that should work in theory but may not have been explicitly tested:

[x] multiple surveys
[x] multiple fishing fleets
[ ] estimate sigma R
[ ] estimate steepness
[ ] estimate slope of maturity (The same logistic function for selectivity is tested, maybe OK)
[ ] estimate inflection point of maturity (The same logistic function for selectivity is tested, maybe OK)
[ ] estimate q
[ ] estimate obs error in index
[ ] estimate M
[ ] multiple populations

I don't know how often, if ever, some of these are estimated, but if they have a toggle to turn estimation on and off, it seems reasonable to test that they can be estimated

ChristineStawitz-NOAA commented 11 months ago

Feedback during seaside chat: it seems uncommon to estimate observation error unless it has a strong prior, sigma R is sometimes estimated but maybe not a best practice.

iantaylor-NOAA commented 11 months ago

Sorry I missed the seaside chat, but here are some additional thoughts on this.

These discussions could be moved to a new issue about estimating variances, as I don't see a need to add tests related to this stuff now.

On estimating sigmaR, Thorson et al. (2019): Perspective: Let’s simplify stock assessment by replacing tuning algorithms with statistics https://doi.org/10.1016/j.fishres.2018.02.005 says

Future iterations of stock-assessment models could efficiently implement the Laplace approximation by coding a stock-assessment model using either ADMB-RE (Skaug and Fournier, 2006) or TMB (Kristensen et al., 2016). Both can calculate the derivative of the Laplace approximation to the marginal log-likelihood with respect to variance parameter σR2, and this greatly improves the speed and reliability of parameter estimation. No rush to implement any of this quickly, but good to think about for the future.

On estimating index observation error, this is common practice in SS3 models and long ago replaced iterative tuning of index uncertainty. The User Manual describes the option as "estimate a parameter that will contain an additive constant to be added to the input standard deviation of the survey variability". I don't see anything special in the likelihood, just the estimated parameter being added here: https://github.com/nmfs-stock-synthesis/stock-synthesis/blob/main/SS_objfunc.tpl#L68. Again, this is a low priority for the moment, but I think useful to have in the plan as it's a key part of data weighting which we want to think about in general for FIMS.

ChristineStawitz-NOAA commented 11 months ago

Thanks @iantaylor-NOAA !

I am testing multi-fleet and multi-survey now, which are the only items in the checklist I want to test at this moment. Some other things like estimating steepness, q, and M seem important but slightly less so, and I will ask Bai about in January, and the variances alone I agree with turning into a separate M2 issue to track discussion.

Update: at least cursorily, I can create multiple fleets and surveys in the test script and get the right expected values!

Bai-Li-NOAA commented 10 months ago

Q and selectivity parameters were estimated in the model comparison project, and these can be easily added to the FIMS estimation tests.

@ChristineStawitz-NOAA, regarding multiple surveys and fishing fleets, are you thinking about a unit test or integration test? We can use the model comparison project OM to create multiple cases (we had 12 cases in the model comparison paper) and test most of the items on the to-do list, including multiple surveys and multiple fishing fleets. I'd be happy to help set up some of the tests.

ChristineStawitz-NOAA commented 10 months ago

Thanks @Bai-Li-NOAA ! I think either could work - I did verify locally we could set up multiple fleets and surveys and Chris also used multiple fleets in his case study. I'm not sure an integrated test would tell us anything on top of what a unit test would?

Another thing that came up in Cole's case study was age-varying M. We have it in both Chris and Cole's case study, but to my knowledge is not included in our test cases? Please let me know if I missed it!

Bai-Li-NOAA commented 10 months ago

I was thinking that unit tests could help us verify that we can set up multiple fleets, while integration tests would allow us to compare estimated parameters from multiple fleets with 'true' values from an OM.

And you are correct, age-varying M is not included in our current tests.

ChristineStawitz-NOAA commented 10 months ago

@Bai-Li-NOAA maybe move to parking lot?

Bai-Li-NOAA commented 10 months ago

Sounds good to me. Perhaps we can go over this issue and add some tests in the MQ after M2?

NOAA-FIMS / FIMS

[Test]: Write tests for FIMS features not covered in existing tests #531