gadget-framework / gadget3

TMB-based gadget implemtation
GNU General Public License v2.0
8 stars 6 forks source link

Accounting for age reading errors #40

Open bthe opened 3 years ago

bthe commented 3 years ago

Looking at what would be necessary to incorporate age reading errors in g3 I "stumbled upon" this paper on the subject. The idea is that given an age a an age reader i will perceive the age to be a'. So what is actually observed has a distribution p_i(a'|a) which depends on the reader. To estimate this distribution function age readers are often asked to compare their readings using a control set of otoliths/bones/scales etc..

The paper suggests that p_i(a'|a) could be a normal density with age varying mean b_ai and error s_ai of the form:

f_i(a) = c_L + (c_Hi - c_Li) * (1-exp(-alpha_i(a - L))/(1-exp(-alpha_i*(H-L)) 

where c_Li and c_Hi are the min (L) and max age (H) values for b_a and s_a depending on the context.

I guess that there are use cases where we would like to change the age reading function and/or the f_i function.

To estimate the density function you use the control set (or sets if more than one) and minimize the following:

nll = -log(sum_a(beta_a * prod_i(p_i(a'_i|a))))

where a is the model age and a'_i the observed age from reader 'i'.

So to implement this in g3 we would need to do something like:

The second step could be optional, either the user could estimate these parameters outside of the model and/or the data is simply not available so you may want to test different structures. Potentially tagging data could inform on this.

I guess the same sort of logic would apply to maturity data.

lentinj commented 3 years ago

How many readers are you likely to define? I'm guessing we're talking a handful rather than 10's or 100's, seems to me that it'd be easier to define a separate catchdistribution component for each reader rather than add an extra "reader" dimension for a single component. If I'm not being stupid, then the main thing is applying a transformation to the model data to convert the age dimension to age-as-read-by-i.

bthe commented 3 years ago

You are right on all counts, the typical number of age readers for a particular species is usually counted on the fingers of one hand and effectively you are creating separate likelihoods for each reader where you would define reader specific age or maturity transformation.

Additionally we will need to define a parameter likelihood function where the control set reads are incorporated. Note that this is also what is needed to define random effects.

lentinj commented 3 years ago

Sketching out the changes required for the first half a bit more. This could work with a g3l_likelihood_agereader() to modify output of g3l_likelihood_distribution(), insert an additional step between collation of modelstock vars and comparison to obsstock, to convert modelstock__num (or __wgt) to as-read age. The actual transformation will end up being implemented in a pretty similar manner to migration---it's the same really, just a different dimension.

The additional likelihood function I'm just about understanding, but think we'll need to go over it monday before it truly sinks in.

lentinj commented 3 years ago

@bthe I've had a go implementing the housekeeping for this to work, have a look at the example script here: https://github.com/gadget-framework/gadget3/blob/age-reading-error-40/demo-age-reading-error/run.R I'm pretty sure it works :)

lentinj commented 1 year ago

As suspected, g3l_controlset is nonsense. The transform_fs bit is good, but there needs to be an action that runs once at the beginning of the model that:

This is useful without, since it may be the wiggle-room needed to e.g. account for variances between growth as observed by tagging data and age measurement.

lentinj commented 1 year ago

The above merges transform_fs.

For modelling the error matrix, consider implementing a TMB version of outer to form the matrix.

lentinj commented 1 year ago

out