gadget-framework / gadget3

TMB-based gadget implemtation
GNU General Public License v2.0
8 stars 6 forks source link

g3l_random_dnorm(.. g3_parameterized('init',by_age = TRUE) ..) doesn't generate anything sensible #101

Open lentinj opened 1 year ago

lentinj commented 1 year ago

@bthe:

I'm trying to apply the g3l_random_dnorm to the initial conditions:

 g3l_random_dnorm('initial_condititions',
                   substitute(log(avoid_zero(x)),
                              list(x = g3_parameterized('init',by_age = TRUE,random = penalise==0, stocks))),
                   weight = g3_parameterized('rnd_weight',scale = -1),
                   sigma_f = g3_parameterized('ghl_recruitment_sigma'))

And the compiler complains about:

/tmp/RtmpCY570l/g3_tmb_f569e6977d5d55c234b604b97d41a5a0b4d6e002.cpp:2435:97: error: ‘age’ was not declared in this scope
 2435 |                 auto n = dnorm(log(avoid_zero( *map_extras::at_throw(ghl__init, std::make_tuple(age), "ghl.init"))), (double)(0), ghl_recruitment_sigma, true);

Any idea why this happens? Looking at the generated code it does not seem to loop over ages.

g3_parameterized Is expecting to be used as part of something that's already looping over a stock, so doesn't do any looping itself. Unfortunately g3l_random_dnorm doesn't know how to loop over a stock either, and even if it did (which isn't easy to do without ugly hacks), that wouldn't solve the each-year case.

There isn't anything to iterate over everything a g3_param_table() generates, but that's what needs to happen here. It'd be very awkward to add though.

Another way around would be converting the table into a fully-fledged stock array. Then iterating over each entry would become trivial, but instead the complexity bubbles up when actually using it, as we don't have a means to intersect 2 stocks, but with a default value.

So not easy, in short.

The cheating way around this is to do the legwork yourself, i.e. for each parameter, generate:-

 g3l_random_dnorm('initial_condititions',
                   substitute(log(avoid_zero(g3_param(x, random = y))),
                              list(x = 'init.5.1994', y = penalise==0))),
    . . .

Me @willbutler42 had to do something similar elsewhere, but I can't think where...

bthe commented 1 year ago

So something like what the bounds likelihood does: https://github.com/gadget-framework/g3experiments/blob/master/R/likelihood_penalty.R#L28 would do the the trick?

lentinj commented 1 year ago

Yeah, that's what I was trying to remember :) Replace the logspace_add()s with a call to dnorm() and you should be pretty much done.

bthe commented 1 year ago

Exactly, for the time being I'm trying:

 min_age:maxage %>% 
         purrr:::map(function(age) 
           g3l_random_dnorm('initial_condititions',
                            substitute(log(avoid_zero(x)), 
                                       list(x = g3_parameterized(sprintf('ghl.init.%s',age),
                                                                            random = penalise==0))),
                            weight = g3_parameterized('rnd_weight',scale = -1),
                            sigma_f = g3_parameterized('ghl_init_sigma')))

which does not complain, which is always good sign. Whether this produces anything sensible is another matter;)

lentinj commented 1 year ago

It should do. The main risk here is if your g3_parameterized() calls result in different defaults to where you actually use them, but looks like you've already taken care of that.

bthe commented 1 year ago

Kicking the tires a bit, I as trying time- and age-varying M:

expand_grid(age = min_age:maxage,
                year = defaults$year) %>% 
      mutate(param_name = sprintf('ghl.M.%s.%s',year,age),
             m_param = sprintf('ghl.M.%s',age),
             id = 1:n()) %>%
      split(.$id) %>% 
      map(function(x) 
        g3l_random_dnorm(sprintf('nat_mort_%s',x$param_name),
                         g3_parameterized(x$param_name,random = penalise==0),
                         mean_f = g3_parameterized(x$m_param),
                         weight = g3_parameterized('rnd_weight',scale = -1),
                         sigma_f = g3_parameterized('ghl_M_sigma')))

which results in a huge list of actions. Adding that to the main actions list and pass to g3_to_* results in the following error:

Warning messages: 1: In value[3L] : Error in get(var_name, envir = env, inherits = TRUE): object 'cur_time' not found ...

where the resulting code only has the nat_mort_* sections. Is there an upper limit to the number of actions that go into g3_to_* ?

Also, it maybe worth noting that adding the penalty for each parameter, like I do with the initial numbers at age, increases the computation time considerably.

lentinj commented 1 year ago

Error in get(var_name, envir = env, inherits = TRUE): object 'cur_time' not found

My bet would be the list structure of actions has been messed up. The g3a_* functions produce lists of formulas, not straight formulas. You then concatenate these lists. If you try to concatenate a list to a formula, R makes odd assumptions. Has one of the lists been unwrapped in your processing?

bthe commented 1 year ago

Yes, it certainly looks like it, here is the more complete entry:

random_actions <- 
  c(list(g3l_random_walk('growth',
                         substitute(log(avoid_zero(x)), 
                                    list(x =g3_parameterized('ghl.K',by_year = TRUE,random = penalise==0))),
                         weight = g3_parameterized('rnd_weight',scale = -1),
                         sigma_f = g3_parameterized('ghl_K_sigma')),
         g3l_random_walk('recruitment',
                         substitute(log(avoid_zero(x)), 
                                    list(x =g3_parameterized('ghl.rec',by_year = TRUE,random = penalise==0))),
                         weight = g3_parameterized('rnd_weight',scale = -1),
                         sigma_f = g3_parameterized('ghl_recruitment_sigma'))),
    min_age:maxage %>% 
      map(function(age) 
        g3l_random_dnorm(sprintf('initial_condititions_age%s',age),
                         substitute(log(avoid_zero(x)), 
                                    list(x = g3_parameterized(sprintf('ghl.init.%s',age),
                                                              random = penalise==0))),
                         mean_f = g3_parameterized('zero',scale = 0),
                         weight = g3_parameterized('rnd_init_weight',scale = -1),
                         sigma_f = g3_parameterized('ghl_initial_sigma'))),

    expand_grid(age = min_age:maxage,
                year = defaults$year) %>%
      mutate(param_name = sprintf('ghl.M.%s.%s',year,age),
             m_param = sprintf('ghl.M.%s',age),
             id = 1:n()) %>%
      split(.$id) %>%
      map(function(x)
        g3l_random_dnorm(sprintf('nat_mort_%s',x$param_name),
                         g3_parameterized(x$param_name,random = penalise==0),
                         mean_f = g3_parameterized(x$m_param),
                         weight = g3_parameterized('rnd_weight',scale = -1),
                         sigma_f = g3_parameterized('ghl_M_sigma'))))

Commenting out the nat_mort part of the random_actions list results in something that compiles, while including it fails. Any ideas?