NOAA-FIMS / FIMS

The repository for development of FIMS
https://noaa-fims.github.io/FIMS/
GNU General Public License v3.0

[Feature]: timing support in FIMS #572

Open ChristineStawitz-NOAA opened 1 month ago

ChristineStawitz-NOAA commented 1 month ago

Is your feature request related to a problem? Please describe.

In the Yellowtail Flounder case study, @cmlegault points out that the index, SSB, and weights at age need to be calculated at mid-year for consistency with the ASAP model.

Describe the solution you would like.

Flexibility to use timing beyond doing all calculations at the start of the year

Describe alternatives you have considered

We could use somewhat fuzzy timesteps, e.g. placing things within a quarter of the year rather than at a user-specified fraction.

Statistical validity, if applicable

No response

Describe if this is needed for a management application

No response

Additional context

No response

iantaylor-NOAA commented 1 month ago

SS3 has lots of flexibility on timing, with options for seasons and subseasons and different settings for the timing of surveys (at a single point in the year) and fisheries (potentially continuous). See https://nmfs-ost.github.io/ss3-doc/SS330_User_Manual_release.html#subseasons-and-timing-of-events for some info.

Casal2 likewise appears to be quite flexible, as illustrated in a figure from the Casal2 Age-based User Manual: https://github.com/NIWAFisheriesModelling/CASAL2/raw/master/Documentation/UserManual/CASAL2_Age.pdf

However, I'm not yet convinced that the benefits of that flexibility offset the potential confusion associated with what those options do. For instance, how much does it help to allow 75% of mortality to occur prior to calculating the expected value of an index if the associated expected age or length composition doesn't align with that point in time? If you're also updating the expected numbers at age then you might as well just have a quarterly time step.

A simpler approach that we could consider would be as follows:

If there are studies supporting more complex timing setups, I'm happy to change my mind. We could also conduct new research if needed to inform these questions, but if that's the case, I would suggest that FIMS start simple on timing and add complexity when it is clearly justified.

cmlegault commented 3 weeks ago

Here in the Northeast, we have a long history of using timing within the year to reflect changes in the population. This is probably due to the large changes in fishing mortality rate that have been seen within the time series of many assessments (order-of-magnitude changes). So we tend to focus on the timing of the surveys when fitting them and the timing of SSB when calculating it. We do use quarterly or half-year age-length keys when deriving the catch at age, but then combine the catch over the year instead of breaking it out into timesteps within the year. This probably has more to do with historical precedent than an actual argument that annual time steps are better than shorter time steps.

I would put it back to @iantaylor-NOAA: why not allow the timing of surveys and the calculation of SSB to be at any user-input time of year? It is an easy calculation, easy to explain, and addresses one source of potential bias when F varies considerably during the time series. I agree that adding the complexity of within-year time steps creates a lot of added burden on data compilation and model definition that may not be warranted.
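To illustrate the kind of calculation being described, here is a minimal Python sketch. It assumes F and M are constant within the year, so numbers at age at a fraction `t` of the year are the start-of-year numbers multiplied by `exp(-Z * t)`. All function and argument names are hypothetical and are not taken from FIMS or ASAP.

```python
import math


def numbers_at_fraction(n_start, f_at_age, m_at_age, fraction):
    """Project start-of-year numbers at age to a fraction of the year.

    Assumes total mortality Z = F + M is constant within the year.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return [
        n * math.exp(-(f + m) * fraction)
        for n, f, m in zip(n_start, f_at_age, m_at_age)
    ]


def ssb_at_fraction(n_start, f_at_age, m_at_age, maturity, weight, fraction):
    """Spawning biomass at a user-specified fraction of the year.

    `weight` should be the weight at age appropriate for that time of year.
    """
    n_t = numbers_at_fraction(n_start, f_at_age, m_at_age, fraction)
    return sum(n * mat * w for n, mat, w in zip(n_t, maturity, weight))


def expected_index(n_start, f_at_age, m_at_age, selectivity, weight, q, fraction):
    """Expected survey biomass index at a user-specified fraction of the year."""
    n_t = numbers_at_fraction(n_start, f_at_age, m_at_age, fraction)
    return q * sum(n * s * w for n, s, w in zip(n_t, selectivity, weight))
```

With `fraction=0.0` both quantities reduce to the start-of-year calculation, so the existing behavior would be a special case. SSB and each survey can use a different fraction.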

msupernaw commented 3 weeks ago

I recall before we started coding anything in FIMS, we had a timing toy example. I'll see if I can dig it up, it may be useful here.

msupernaw commented 3 weeks ago

Not sure, but this https://github.com/NOAA-FIMS/m1-prototypes/blob/main/fims_indexing.cpp may have been the timing example from a few years ago.

iantaylor-NOAA commented 3 weeks ago

@cmlegault, I think you're suggesting that instead of using the beginning and middle of the year or time step (e.g. Jan 1 and July 1) for all the calculations as I suggested, the user could specify, for example, SSB calculated on Feb 1 and survey biomass on Aug 1 if that provides a better match to the biology and survey timing. That seems reasonable to me. We just need to be explicit about which calculations incorporate that timing and which, if any, don't. Or if we give all the calculations flexible timing, it would be good to better explore the tradeoffs of calculating values at many times of the year vs. keeping things simpler.

My concern is about both the potential computational burden (having to calculate and keep track of numbers-at-age and/or the distribution of length-at-age at many different points in the year) and the potential conceptual mismatch (having users believe that the input for survey timing accounts for growth within the year when in fact the same weight-at-age or length distribution is used regardless of the input value).

msupernaw commented 3 weeks ago

Good points, Ian. Not sure if it helps, but in MAS we had scalar values like survey_fraction_year, catch_fraction_year, and a few others. We interpolated the empirical weight-at-age data to match all the necessary fractions of the year, carried out the calculations, and used those results as the derived value for the model timestep. So the calculation was carried out at the fraction of the year and stored at the general model timestep index, if that makes any sense. Those fractions could all be different, and any calculation using the weight-at-age data used the appropriate whole-number age plus the fractional age amount, which was stored in a hash table with a key equal to the whole-number age plus the fractional age. Of course, MAS also accepted fractional ages as input, but it was never tested that way except maybe with the first age group being 0.1 rather than zero. This method seemed to work fine without the computational burden you mentioned above. I'm sure there are tradeoffs, but this seemed to alleviate the need for any complicated bookkeeping.
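A rough Python sketch of that bookkeeping might look like the following. The interpolation method MAS used is not stated above, so linear interpolation between adjacent whole ages is an assumption here, as is holding the weight constant beyond the oldest age. The function name and the rounding of keys are hypothetical choices for illustration only.

```python
def build_weight_table(ages, weights, fractions):
    """Build a lookup of weight at (whole age + fraction of year).

    Assumes `ages` are consecutive whole numbers and `weights` are
    start-of-year empirical weights at those ages. Weights at fractional
    ages are linearly interpolated toward the next age (an assumption),
    and the oldest age keeps its weight for any fraction.
    """
    table = {}
    for frac in fractions:
        for i, age in enumerate(ages):
            if i + 1 < len(ages):
                w = weights[i] + frac * (weights[i + 1] - weights[i])
            else:
                w = weights[i]
            # Round the key so lookups are robust to floating-point noise.
            table[round(age + frac, 6)] = w
    return table


# Hypothetical fractions, named after the MAS scalars mentioned above.
survey_fraction_year = 0.5
catch_fraction_year = 0.25

table = build_weight_table(
    ages=[1, 2, 3],
    weights=[0.2, 0.4, 0.5],
    fractions=[0.0, survey_fraction_year, catch_fraction_year],
)

# Weight used for age-2 fish in a mid-year survey calculation.
w_survey_age2 = table[round(2 + survey_fraction_year, 6)]
```

Each calculation looks up the weights for its own fraction, and the result is stored against the ordinary annual timestep, so no within-year state needs to be tracked.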

cmlegault commented 3 weeks ago

Thanks @iantaylor-NOAA, your summary captures my point more clearly. We input multiple weight-at-age matrices to allow for the different timing of events during the year. This is similar to what @msupernaw suggested, but puts the onus on the user to determine the appropriate weights at age for the different timings. Having the user enter different weight-at-age matrices makes it quite clear that growth differences can/should be accounted for when using the different timings.

JonBrodziak commented 3 weeks ago

Seasonal differences in mean weights at age can be readily accounted for in annual models.

I suggest we move forward with a simple approach for M2, providing for an annual within-year mortality adjustment for observations. As suggested, this is not really that complicated to code.

More detailed observations for seasonal model structures could be developed for future milestones.

iantaylor-NOAA commented 3 weeks ago

Discussion during the seaside chat today agreed that there would be value in doing some simulation analyses with existing platforms to explore the benefits and impacts of different options for timing and seasons. Existing platforms may not be able to provide useful information on the speed/efficiency of these features in FIMS, but they could show us the impacts of finer or coarser timing of calculations for mortality, weight-at-age, or growth modeling.

Rick-Methot-NOAA commented 2 weeks ago

Detailed timing is most important when: (a) there is a short-duration, high-F fishery happening shortly before a survey; (b) the model is using length data for a fast-growing fish, so the time of year at which the model length-at-age is evaluated becomes very important to get the modes to line up. SS3 put a lot of coding into version 3.30 to achieve better capability for (b). This included the introduction of a subseason concept so that growth could be evaluated only when needed. With 2 subseasons, growth was evaluated at the beginning and midpoint of each season. With more subseasons there was potential for more growth evaluations, but only if there was a survey that occurred close to that point in time in that year. For example, 12 subseasons in an annual model would allow for calculation of growth every month, but would only do so if some survey had an observation that month. Even so, with time-varying growth interacting with detailed seasonal timing there will be a high computational load. SS3 reads survey observation timing as year, real_month, then interpolates that point in time to a particular season and subseason when it reads the input. This provides much more flexibility than reading year, season as the timing for a survey.
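As a rough illustration of mapping a (year, real_month) observation time onto a season and subseason, here is a small Python sketch. It is not SS3's code. It assumes equal-length seasons, equal-length subseasons within each season, and that a real_month of 1.0 means the start of the year. The function name is hypothetical.

```python
def month_to_season_subseason(real_month, n_seasons, n_subseasons):
    """Map a real-valued month in [1, 13) to 1-based (season, subseason).

    Assumes seasons are of equal length and each season is split into
    equal-length subseasons. This is a simplification for illustration.
    """
    if not 1.0 <= real_month < 13.0:
        raise ValueError("real_month must be in [1, 13)")
    year_fraction = (real_month - 1.0) / 12.0
    # Position measured in units of seasons, e.g. 2.33 = a third into season 3.
    season_position = year_fraction * n_seasons
    season = int(season_position) + 1
    within_season = season_position - int(season_position)
    subseason = int(within_season * n_subseasons) + 1
    return season, subseason


# A survey on July 1 in an annual model with 2 subseasons.
print(month_to_season_subseason(7.0, n_seasons=1, n_subseasons=2))
```

Under this scheme, growth would only need to be evaluated for the (season, subseason) pairs that some observation actually maps to.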

ChristineStawitz-NOAA commented 2 weeks ago

These simulation analyses could be a good project for the FIMS R&D contract hire. Before that is included, we need to decide if we want to make any progress on this in M2.

Rick-Methot-NOAA commented 2 weeks ago

Back to Ian's suggestion to just use more seasons. The downside of that is that it generally requires all data to be subdivided by those same seasons, which could be difficult for some of the fishery data.