Default g3_parameterized() parameters for g3_suitability_andersen

MikkoVihtakari commented 1 year ago

While trying to make plotting function for Andersen suitability here, I noticed that gadget3::g3_suitability_andersen behaves unexpectedly. This might be because I have misunderstood something or maybe it is a bug.

For example, when I try to plot the following parameters from my model

library(gadgetplots)
g3plot_andersen(1:120, p0 = 0, p1 = 0.659, p2 = 1, p3 = 0.15, p4 = 9942, p5 = 120)

This looks quite similar to the output I get from the model using this function. Yet, as a double check we made an Excel file using the equation here, and with the same parameters, we get tihs:

Screenshot 2023-02-23 at 12 24 01

I can send you the Excel sheet by email if you want. It's been controlled by two people and neither of us found mistakes in it.

When I dig into this, I notice that instead of the hard cut in the equation, g3_suitability_andersen uses a rather sharp gradient

gadget3::g3_suitability_andersen
#> function (p0, p1, p2, p3 = p4, p4, p5 = ~pred_stock__midlen) 
#> {
#>     f_substitute(~avoid_zero(p0) + avoid_zero(p2) * exp(-(log(avoid_zero_vec(p5/stock__midlen)) - 
#>         p1)^2/avoid_zero(p4)) * bounded_vec(100 * (p1 - log(avoid_zero_vec(p5/stock__midlen))), 
#>         0, 1) + avoid_zero(p2) * exp(-(log(avoid_zero_vec(p5/stock__midlen)) - 
#>         p1)^2/avoid_zero(p3)) * bounded_vec(100 * (log(avoid_zero_vec(p5/stock__midlen))) - 
#>         p1, 0, 1), list(p0 = p0, p1 = p1, p2 = p2, p3 = p3, p4 = p4, 
#>         p5 = p5))
#> }
#> <bytecode: 0x7f7f85d59170>
#> <environment: namespace:gadget3>

I.e. we get a mixture of those two functions:

> bounded_vec(100 * (0.659 - log(avoid_zero_vec(120/62))), 0, 1)
# [1] 0.5338819

If I replace gadget3::g3_suitability_andersen by this:

suitability_andersen <- function (p0, p1, p2, p3 = p4, p4, p5, len) 
{
  sapply(len, function(stock__midlen) {
    if(log(p5/stock__midlen) <= p1) {
      avoid_zero(p0) + avoid_zero(p2) * exp(-(log(avoid_zero_vec(p5/stock__midlen)) - p1)^2/avoid_zero(p4))
    } else {
      avoid_zero(p0) + avoid_zero(p2) * exp(-(log(avoid_zero_vec(p5/stock__midlen)) - p1)^2/avoid_zero(p3))
    }
  })
}

I'll get similar plot than with the Excel sheet from g3_suitability_andersen

^{Created on 2023-02-23 with reprex v2.0.2}

lentinj commented 1 year ago

In a relatively unrelated note, I added g3_eval recently, that lets you do things like this:

 g3_eval(
     g3_suitability_andersen(0,1,2,3,4,5),
     stock = g3_stock('prey', 1:10))

May well make your life easier. Apologies for not doing it earlier!

I don't think it's you. Your suitability_andersen() certainly looks like it matches the gadget2 user guide, so I'd expect the gadget3 version to match. I also briefly looked at the andersen function a week or so ago and not entirely convinced by it.

The if statement is the problematic part. For TMB to generate the list of operations applied, to generate the gradient function, we can't do that. As you may have guessed, that's what the bounded_vec(100 * (log(avoid_zero_vec(p5/stock__midlen))) - p1, 0, 1) is trying to work around.

I think that g3_suitability_andersen() gets p3 and p4 the wrong way around, and there's a bracket slip-up for - p1 in the final clause. Using the following that fixes both:

working_g3_suitability_andersen <- function (p0, p1, p2, p3 = p4, p4, p5 = ~pred_stock__midlen) {
  gadget3:::f_substitute(~avoid_zero(p0) +
                 avoid_zero(p2) * exp(-(log(avoid_zero_vec(p5/stock__midlen)) - p1)**2/avoid_zero(p3)) *
                 bounded_vec(100*(p1 - log(avoid_zero_vec(p5/stock__midlen))),0,1) +
                 avoid_zero(p2) * exp(-(log(avoid_zero_vec(p5/stock__midlen)) - p1)**2/avoid_zero(p4)) *
                 bounded_vec(100*(log(avoid_zero_vec(p5/stock__midlen)) - p1),0,1),
               list(
                 p0 = p0,
                 p1 = p1,
                 p2 = p2,
                 p3 = p3,
                 p4 = p4,
                 p5 = p5))
}

...g3_eval(working_g3_suitability_andersen(p0 = 0, p1 = 0.659, p2 = 1, p3 = 0.15, p4 = 9942, p5 = 120), stock = g3_stock('s', 1:120)) matches your implementation, at least by rough eyeball.

@bthe Any comments? I think it's a fairly clear-cut bug that needs fixing.

bthe commented 1 year ago

Nope, this line https://github.com/gadget-framework/gadget3/blob/master/R/suitability.R#L16 certainly looks like a bug.

MikkoVihtakari commented 1 year ago

Just a small addition to your code @lentinj, I think the multiplier parts (e.g. bounded_vec(100*(p1 - log(avoid_zero_vec(p5/stock__midlen))),0,1)) should always return 1 or 0 if I understood your code correct. For example this doesn't:

bounded_vec(100*(0.659 - log(avoid_zero_vec(120/62))),0,1)
# [1] 0.5338819

lentinj commented 1 year ago

I think the multiplier parts ... should always return 1 or 0 if I understood your code correct.

In ideal world, yes. But we can only approximate it with TMB. The gentle progression of curve(g3_eval(~bounded_vec(x,0,1)), -10, 10) from 0..1 is expected.

The 100 * is tightening this curve a bit, but maybe needs to be bigger.

bthe commented 1 year ago

The 100 * is tightening this curve a bit, but maybe needs to be bigger.

I have assumed that somewhere between 100 and 1000 should be enough for the typical use case. I guess there could cases where this is assumption fails, but in that case we could just define this multiplier as an extra argument to the function, with a default value of 100.

MikkoVihtakari commented 1 year ago

Ah, ok, so round(bounded_vec(100*(0.659 - log(avoid_zero_vec(120/62))),0,1), 0) wouldn't work?

MikkoVihtakari commented 1 year ago

Btw, our NEA Ghl benchmark delivery is hanging on this bug...Any chance fixing it soon?

lentinj commented 1 year ago

@MikkoVihtakari the above should improve matters, and is pretty close to the real version now.

MikkoVihtakari commented 1 year ago

Thanks @lentinj! I'll rerun iteration tonight and see if this fixes the issues we have been having in optimising Andersen parameters. Close when you're ready.

bthe commented 1 year ago

Ah, ok, so round(bounded_vec(100*(0.659 - log(avoid_zero_vec(120/62))),0,1), 0) wouldn't work?

Probably not as rounding is not differentiable.

lentinj commented 1 year ago

The original reason I started looking at g3_suitability_andersen was that I was suspicious all the uses of avoid_zero. Whilst writing the test for the above, it became obvious that avoid_zero(p0) wasn't achieving anything than inaccuracy, so ditched it.

The same could be said for avoid_zero(p2) I think, no NaNs will be generated by multiplying by zero.

In fact, even for p3, p4 & p5 seem redundant:

> exp(-(log(0)-0)**2/0)
[1] 0
> g3_eval(~ bounded_vec(1000*log(0),0,1))
[1] 1

Is there a reason for their existence other than ducking NaN generation?

bthe commented 1 year ago

The main reason for the excessive use of avoid_zero was to ensure that the value returned would be strictly positive, and to avoid hard crashes in g3_tmb_adfun when the initial parameters were inappropriate.

But things have changed quite a bit since, so I think this can be sorted out with appropriate bounds, although I would be tempted to exponentiate p4 and p5 by default.

MikkoVihtakari commented 1 year ago

It seems to produce expected results now:

gadgetplots::g3plot_andersen(1:120, p0 = 0, p1 = 0.659, p2 = 1, p3 = 0.15, p4 = 9942, p5 = 120)

^{Created on 2023-02-24 with reprex v2.0.2}

I made gadgetplots to use g3_eval instead of eval: https://github.com/gadget-framework/gadgetplots/commit/8d0e7535a0ee9dd6180b4a413faa6c44ead80b48

Thanks @lentinj!

MikkoVihtakari commented 1 year ago

The same fleet which I used as an example gets optimised to this now

gadgetplots::g3plot_andersen(1:120, p0 = 0, p1 = 0.641, p2 = 1, p3 = 1, p4 = -0.006, p5 = 120)

^{Created on 2023-02-24 with reprex v2.0.2}

There's obviously something wrong with my parameter bounds (I use g3experiments::g3l_bounds_penalty), but p4 should not become negative nevertheless. My bounds for it are c(0.000001, 10000). Perhaps exponentiating it is not a bad idea.

lentinj commented 1 year ago

Maybe what we need to do is instead of including avoid_zero in the formula, pass it in via. g3_parameterized like we've done with g3a_grow_lengthvbsimple and elsewhere. Something like:

g3_suitability_andersen <- function (
    p0 = g3_parameterized('andersen.p0', by_stock = by_stock, avoid_zero = TRUE),
    by_stock = TRUE,
    . . .) {

This would save faffing about filling the parameters, and can give more options for tweaking how the parameter is used.

lentinj commented 1 year ago

@MikkoVihtakari Any preferences for what the defaults / function signature should be? Seems only fair you get first dibs :)

Also, as a note to self, once this is done we can add a test of:

    for (x in 1:1000) {
        params <- list(
            p0 = runif(1, 0, 1),
            p1 = runif(1, 0, 1),
            p2 = runif(1, 0, 1),
            p3 = runif(1, 1e2, 1e5),
            p4 = runif(1, 1e2, 1e5),
            p5 = runif(1, 1, 120),
            stock__midlen = 1:120)
        offset <- 0
        ok(ut_cmp_equal(
            do.call(g3_andersen, params) + offset,
            do.call(nondiff_andersen, params),
            tolerance = 1e-6), paste0("g3_suitability_andersen matches ", deparse1(params)))
    }

MikkoVihtakari commented 1 year ago

I don't think I understood entirely why we need default values but I'll try to answer your question.

Default could be dome shape with median around average of stock__midlen vector. I would not pick values randomly for all those parameters because they interact with each other and make weird shapes.

p0 sets the lower y-limit and p2 the upper limit (p0+p2). Since we are estimating selectivity, these variables should be fixed to 0 and 1 respectively. p5 can be fixed to max(stockmidlen). Then there are three parameters left. p1 is related to log(p5/stockmidlen). Assuming that p5 is fixed as suggested above, log(2) would set the top of the dome to mean(stock__midlen) if I understand the equation correct. You could set bounds somewhere between 0 and 4 or something.

Then it's p3 and p4 left. p3 influences the left side while p4 the right side. 0 makes the slope almost infinite while 10000 makes the slope almost 0.

Could something like this be a reasonable starting point?

gadgetplots::g3plot_andersen(1:120, p0 = 0, p1 = 0.6931472, p2 = 1, p3 = 0.1, p4 = 0.1, p5 = 120)

^{Created on 2023-02-27 with reprex v2.0.2}

@bthe, what do you think?

bthe commented 1 year ago

The values that @MikkoVihtakari suggests as defaults are fine. But since we are on the subject it may be worth to put our thinking caps for how we define these suitability functions.

The parametrisation of the Andersen function comes from the Andersen and Ursin paper published in 1977, purely focusing on multi-species interactions, and the way we parametrise this function for fleets is fairly unintuitive. We could re-factor the suitability functions into a set of basic functions that could be added or multiplied together to define the desired form. At present the suitabilities are based on three basic form:

Logit: f(x) = 1/(1+exp(x))
Constant: g_c(x) = c
Dome: h(x) = exp(-x^2)

And in this frame the suitability functions can be written in terms of these functions, with the Andersen function being the most complicated, and could be written as something like:

S(l) = g_{p0}(l) +  p2 * (  h(-x/sqrt(p3))) * f(1000*x) + h(-x/sqrt(p4))) * f(-1000* x))

where x = (log(p5/l) - p1). But I have seen other forms of selection functions that could be set up as the sum or product of the three basic forms. Could this be generalised?

MikkoVihtakari commented 1 year ago

Do we need p0 from the old formula? As far as I understand it sets the minimum. That should always be 0, no?

Also p2 != 1 is questionable but could be there just to give the flexibility in case one needed it one day.

lentinj commented 1 year ago

Okay, so we could have a separate g3_suitability_andersenfleet that looks like:-

g3_suitability_andersenfleet <- function (
    p0 = g3_parameterized('andersen.p0', by_stock = by_stock, value = 0, optimise = FALSE),
    p1 = g3_parameterized('andersen.p1', by_stock = by_stock, value = log(2)),
    p2 = g3_parameterized('andersen.p2', by_stock = by_stock, value = 1, optimise = FALSE),
    p3 = g3_parameterized('andersen.p3, by_stock = by_stock, value = 0.1, exponentiate = exponentiate),
    p4 = g3_parameterized('andersen.p4, by_stock = by_stock, value = 0.1, exponentiate = exponentiate),
    p5 = quote( stock__upperlen ),
    by_stock = TRUE,
    exponentiate = TRUE) { 
  ~p0 + 
    p2 * exp(-(log(p5/stock__midlen) - p1)**2/p4) *
    bounded_vec(100*(p1 - log(p5/stock__midlen)),0,1) +
    p2 * exp(-(log(p5/stock__midlen) - p1)**2/p3) *
    bounded_vec(100*(log(p5/stock__midlen)) - p1,0,1),
}

EDIT: Now takes into account the 2 comments below

Hopefully then just g3_suitability_andersenfleet() should do something sensible in most cases.

~~Does it make sense to bring out both avoid_zero and exponentiate?~~ Anything else that would be reasonably common to tweak?

bthe commented 1 year ago

Looks like a good start, but do we need avoid_zero_vec(p5/stock__midlen) now?

Does it make sense to bring out both avoid_zero and exponentiate? Anything else that would be reasonably common to tweak?

Only thing I would think about here I guess is that exponentiating the variable is a cheaper than applying avoid_zero.

lentinj commented 1 year ago

do we need avoid_zero_vec(p5/stock__midlen) now?

It's certainly pointless in the default case. But could come into play if p5 != stock__upperlen.

You're right I think. We remove avoid_zero_vec(...), and say if you want to customize p5 then do something like p5 = g3_parameterized(..., avoid_zero = TRUE).

exponentiating the variable is a cheaper than applying avoid_zero.

Maybe we just do the latter then, and remove the avoid_zero parameter

gadget-framework / gadget3

Default g3_parameterized() parameters for g3_suitability_andersen #108