gadget-framework / gadget3

TMB-based gadget implemtation
GNU General Public License v2.0
8 stars 5 forks source link

Predation #29

Closed lentinj closed 3 months ago

lentinj commented 3 years ago

Implement predation as per gadget2: https://gadget-framework.github.io/gadget2/userguide/chap-stock.html#sec:stockpredator

Examples of gadget2 models to integration-test against

To-do list

vbartolino commented 3 years ago

a two species model with seal feeding on white fish is available in gadget2. I haven't tried migration to gadget3 for this one

lentinj commented 3 years ago

I meant the gadget2 model, I'll use it to validate that gadget3 is doing the right thing. Thanks!

lentinj commented 3 years ago

Predation will need separately named suitability variables, e.g. __fleetsuit and __predsuit, so gadget.fit() can treat them appropriately.

lentinj commented 2 years ago

An example two-species model: https://github.com/gadget-framework/gadget-models/tree/master/ven_rin_08.3

lentinj commented 6 months ago

@vbartolino @bthe I've been having a go at implementing predation yesterday, one thing not listed above is reporting. Currently we store, for a predator/fleet s.pred & prey s.prey:

s.pred__catch is just an internal counter really, it doesn't get reported but we need somewhere to sum all prey.

s.prey__predby_s.pred is what is reported, and contains all the breakdown you want, as fleets don't have any meaningful divisions of their own---even if this was a multi-area model, the area a fleet caught s.prey would be the same as the one it was predated in. However, with predator stocks this isn't the case; they have age & length of their own.

We need to either:

  1. Create a predation array with dimensions area, s.pred_age, s.pred_length, s.prey_age, s.prey_length. This could then be broken down any way you wanted, but I'd rather avoid this if we can. It'd be a very ungainly array to work with, we don't have any machinations to combine dimensions like this yet, and getting the order of dimensions right will be hard work.
  2. Create a s.pred__catchof_s.prey, the total biomass consumed in terms of s.pred's area/age/length. This won't give you any breakdown of the sort of prey that the predator is going for, but OTOH would be a lot easier to manage.

Any preferences? Or at least reasons why (2) isn't enough? I think a lot of it depends on the complexity of the selectivity function. If it's reasonably static then deriving the spread of prey from the predator attributes is fairly simple.

vbartolino commented 6 months ago

if I understand s.pred__catch and s.pred__catchof_s.prey have same resolution on the predator (area/age/length) but the first aggregated over all prey species while the second is by prey species, correct?

  1. Create a predation array with dimensions area, s.pred_age, s.pred_length, s.prey_age, s.prey_length...

for seal we can certainly get rid of the dimension s.pred_length. I wonder if it would make life easier for a more general case to have separate reporting (I've no idea how complicated this is...) rather than a full array to cover different possible combinations of interest like:

  1. area, s.pred_age, s.prey_age
  2. area, s.pred_age, s.prey_length
  3. area, s.pred_length, s.prey_age
  4. area, s.pred_length, s.prey_length

1 & 2 would be the most relevant for seal-vendace-herring (Formas) but I can see cases like the central Baltic model with cod-herring-sprat (MareFrame) where 3 & 4 would be more useful.

In addition to biomass, I think we should be able to track also prey numbers consumed which are needed to estimate the predation mortality (something we usually want to compare to fishing mortality), do you both agree?

lentinj commented 6 months ago

if I understand s.predcatch and s.predcatchof_s.prey have same resolution on the predator (area/age/length) but the first aggregated over all prey species while the second is by prey species, correct?

Exactly.

rather than a full array to cover different possible combinations of interest

It'd be easier from my point of view to have the full array, but being able to drop not-so-useful dimensions could be pretty useful to reduce memory requirements.

we should be able to track also prey numbers consumed

You can divide by mean weight, provided you have full-resolution total biomass of prey consumed. We do also have the ability to generate s.pred__catchnum, mostly because we have to for numberfleet.

lentinj commented 6 months ago

Another thought, dropping dimensions could happen at the reporting stage, rather than in predation itself.

lentinj commented 6 months ago

Right, I've a reworked g3a_predate() that produces a (predator)_(prey)__cons array, to replace (prey)__predby_(pred).

We don't drop the predator area dimension, as I might have alluded to at some point. In a model with a single area everywhere it's a minor annoyance, in a multi-area model it's meaningful (a fleet works in multiple areas, for which the stock is in only one.)

This can be a drop-in replacement for g3a_predate_fleet(), so long as we don't mind the output variables changing. For existing models, apart from the name change the only difference will be an extra 1-long "pred_area" dimension. The impedance mismatch could be easily dealt with in g3_fit I think. Any objections to doing something along these lines (versus preserving the old g3a_predate_fleet(), @willbutler42 ?

One thing I'm still yet to decide is the naming, (predator)_(prey)__cons means it's hard to pick apart the parts from rin_venimm__cons with a regex. rin:venimm__cons isn't valid R, so makes it a pain to type. rin_of_venimm__cons could work, at the risk of making more very long variables. Any preferences?

Also, I've figured out the Eigen magic required to multiply non-conforming arrays, so g3a_predate() should do a lot less looping, see #134.

bthe commented 6 months ago

I guess we have no problems with adding the area dimension to the g3_fit output, barring some minor reshuffling of the code in that function. I think none of the code so far has made any reference to area, or lack thereof, so I don't anticipate any issues downstream. But thinking about this and https://github.com/gadget-framework/gadget3/issues/152 , it is probably worth thinking about is the added predator length (and potentially age) dimension and how we deal with that generically in g3_fit (if at all). When I wrote the original g3_fit I cut a few corners as I assumed that all predators were only in one size group and selectivity was only dependent on the size of the prey.

Also, I've figured out the Eigen magic required to multiply non-conforming arrays, so g3a_predate() should do a lot less looping, see https://github.com/gadget-framework/gadget3/issues/134.

Awesome, look forward to testing this when implemented:)

lentinj commented 6 months ago

Yeah, it wouldn't take much to support either in g3_fit, if __cons is available group by the columns you're expecting, and sum the rest (i.e. "pred_area" in a fleet case).

In the future I think we can add accessor functions for derived values, e.g. g3r_consumption_biomass(report, predator, prey) (reports cons), g3r_consumption_biomass(report, predator, prey) (reports cons / wgt), g3r_consumption_overconsumption(report, predator, prey) (reports cons / (wgt * num) or so). You can then use [,,,drop = TRUE] and/or as.data.frame() to taste.

Doing any operations on arrays rather than data.frames will mean you don't really need to think about how many dimensions there are. Although we probably want to offer helpers that roll up everything apart from predator length, e.g.

lentinj commented 6 months ago

Switching over to __cons is quite a chunk of work for the likelihood function, so I think I'll put in a __predby_ compatibility layer for the time being, at least so the commits are a manageable size.

We'll have to look at converting the likelihood function later anyway.

lentinj commented 5 months ago

I think I've convinced myself that the answer is "combined", so feel free to ignore the below. But thoughts welcome if you like. I've written it now, so it seems a shame to just bin it :)

Do we think that the unit of the suitable stock (i.e. numbers/biomass/energycontent) is a separate property to catchability or combined?

Combined

We make catchability functions provide 2 formulas:

g3a_predate_catchability_totalfleet <- function (E) {
    list(
        F = quote( suit_f * stock_ss(stock__num) * stock_ss(stock__wgt) ),
        C = f_substitute(
            ~stock_ss(predprey__suit) * (E / total_predsuit),
            list(E = E)) )
}

g3a_predate_catchability_numberfleet <- function (E) {
    list(
        F = quote( suit_f * stock_ss(stock__num) ),
        C = f_substitute(
            ~stock_ss(predprey__suit) * (E / total_predsuit) * stock_ss(stock__wgt),
            list(E = E)) )
}

Then our future g3a_predate_catchability_predator would need to convert to energycontent in F, and back again in C.

Separate

Instead, we have an option for F / predprey_suit to be either in biomass or numbers (or energycontent).

g3a_predate <- function (
        predstock,
        prey_stocks,
        suitabilities,
        catchability_f,
        suit_type = c('biomass', 'number', 'energycontent'),
            . . .
        run_at = g3_action_order$predate) {

Then, g3a_predate_catchability_totalfleet and g3a_predate_catchability_numberfleet are the same function, because we're only varying the unit of predprey__suit / total_predsuit by setting suit_type:

g3a_predate_catchability_totalfleet_or_numberfleet <- function (E) {
    f_substitute(
            ~stock_ss(predprey__suit) * (E / total_predsuit),
            list(E = E)) )
}

Similarly, it means we can do a quotafleet by numbers instead of biomass without writing another function. In theory we could do a quota with energycontent, but that makes no sense :)

The main downside is explaining it is a lot harder, since the formluae used depends on lots of parameters in different locations. And making a g3a_predate_catchability_quotanum isn't an impossible hardship.

lentinj commented 5 months ago

In gadget2 the feeding level is scaled by overconsumption before being used for growth. Is this worth it? It seems like it'd be a fairly expensive operation for fairly little gain (in an actual understocking situation, the nll is scaled such that we immediately throw the model in the bin).

Then again, I'm starting to feel the same about the whole overconsumption dividing-prey-evenly-between-predators process.

vbartolino commented 5 months ago

In gadget2 the feeding level is scaled by overconsumption before being used for growth.

It makes sense that if a predator reaches overconsumption the feeding level is less than what calculated by eq. 4.23 and for any other process (ie, growth) it should be rescaled accordingly. However, I agree that it wouldn't make sense to use a model with overconsumption and that should be fixed first. The only situation I can think about is when a minor prey species has a low abundance which could easily trigger overconsumption but the model could still be considered acceptable on the overall picture and dynamics.

vbartolino commented 5 months ago

Combined

We make catchability functions provide 2 formulas:

g3a_predate_catchability_totalfleet <- function (E) {
    list(
        F = quote( suit_f * stock_ss(stock__num) * stock_ss(stock__wgt) ),
        C = f_substitute(
            ~stock_ss(predprey__suit) * (E / total_predsuit),
            list(E = E)) )
}

g3a_predate_catchability_numberfleet <- function (E) {
    list(
        F = quote( suit_f * stock_ss(stock__num) ),
        C = f_substitute(
            ~stock_ss(predprey__suit) * (E / total_predsuit) * stock_ss(stock__wgt),
            list(E = E)) )
}

I'm probably misunderstanding the formulations, but shouldn't F include the prey preference exponent d_p like in eq. 4.21?

lentinj commented 5 months ago

shouldn't F include the prey preference exponent d_p like in eq. 4.21?

It's a really good question. totalfleet, as implemented today, doesn't have any prey preference, which is why it's not included above. But the nascent g3a_predate_catchability_predator does:

g3a_predate_catchability_predator <- function (
        prey_preferences = 1,
        . . . ) {
    list(
        suit = f_substitute(quote(
                (suit_f * stock__energycontent * stock_ss(stock__num) * stock_ss(stock__wgt))^prey_preference
            ), list(
                prey_preference = list_to_stock_switch(prey_preferences) )),

It would be equally easy to add prey_preferences as an argument to g3a_predate(), and make it available regardless of the catchability method used. But (similar to the above) I reasoned it would be easier to see what a catchability method does if everything is in the formula, even if it means repeating ourselves a few times if we have to add prey_preference to totalfleet (e.g.).

lentinj commented 5 months ago

Right, this is getting somewhere now. The most interesting part is probably the final commit, with the g3a_predate_catchability_predator implementation, and it's defaults:

https://github.com/gadget-framework/gadget3/blob/037a916beac5d9faf00efcf3804e8d7987636d65/R/action_predate.R#L101-L138

energycontent & half feeding are implemented as non-optimised parameters by prey & predator respectively. This feels like the most "gadget3" way to do them. As with elsewhere the implementation could be customised by overriding the parameter, e.g. to be seasonal.

Seem reasonable? Are there any defaults you'd change?

The next step is to tidy up internal uses of __predby_ and replace with the new __cons. Once that's done, we can think about switching g3_fit over too and removing the compatibility layer.

The code to make feeding level available to growth also needs to be added. I have code locally to do this, but it doesn't take overconsumption into account yet.

vbartolino commented 5 months ago

I always thought about the feeding level as a function responding to prey density more than their absolute value. This might be relevant in the case of more than one area of different size, unless an area specific half_feeding_f is provided by the user. Do you agree?

lentinj commented 5 months ago

Maybe what we're missing is an extra term for area size? But that said, I can imagine other reasons for having an area-specific half_feeding_f. We can't do g3_parameterized('half_feeding', by_area = TRUE) yet, but it wouldn't be a problem to do so, just a case of copying the code for by_age.

vbartolino commented 5 months ago

We can't do g3_parameterized('half_feeding', by_area = TRUE) yet, but it wouldn't be a problem to do so

it seems a good idea to me

lentinj commented 5 months ago

Yes, g3_parameterized('half_feeding', by_area = TRUE) really is a problem, but by the time I'd remembered that I was too far down the road of sorting it. It's mostly done, but parked in #153 for now so we can get predation finished first.

lentinj commented 5 months ago

The stuff in the rewritten-predation branch starting to look useful. You can see a brief example of how to use it here: https://github.com/gadget-framework/gadget3/blob/rewritten-predation/tests/test-action_predate-predator.R

The main thing to think about before merging is what to do about the feeding level, in particular whether we need to compensate for understocking. I know how to do it, I'd just rather avoid it if I can :)

The other thing I'd like to have a go at is removing pred_area from the __cons array (new predator-prey arrays). In single-area models it doesn't matter, but carting about a heap of zeros for each area in the model is very wasteful. It creates some awkward edge cases for predation, but at least removing it at the time of reporting seems worthwhile. EDIT: This is done now.