*bthe opened this issue 3 years ago.*
How many readers are you likely to define? I'm guessing we're talking a handful rather than tens or hundreds, so it seems to me that it'd be easier to define a separate `catchdistribution` component for each reader rather than add an extra "reader" dimension to a single component. If I'm not being stupid, then the main thing is applying a transformation to the model data to convert the age dimension to age-as-read-by-reader-*i*.
You are right on all counts: the typical number of age readers for a particular species can usually be counted on the fingers of one hand, and effectively you are creating separate likelihoods for each reader, where you would define a reader-specific age or maturity transformation.

Additionally we will need to define a parameter likelihood function where the control-set reads are incorporated. Note that this is also what is needed to define random effects.
Sketching out the changes required for the first half a bit more. This could work with a `g3l_likelihood_agereader()` to modify the output of `g3l_likelihood_distribution()`: insert an additional step between collation of the modelstock vars and comparison to obsstock, converting `modelstock__num` (or `__wgt`) to as-read age. The actual transformation will end up being implemented in a pretty similar manner to migration; it's the same really, just a different dimension.
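A minimal plain-R sketch of that conversion step (not gadget3 internals; the matrix and numbers here are made up for illustration), assuming the reading matrix is row-normalised:

```r
# reading_matrix[true_age, read_age] holds p_i(read_age | true_age),
# each row summing to 1, so total numbers are preserved.
ages <- 1:10
num <- c(100, 90, 75, 60, 45, 30, 20, 12, 6, 2)  # e.g. modelstock__num

reading_matrix <- diag(length(ages))
for (a in 1:9) {  # say 10% of each age is misread one year too high
    reading_matrix[a, a] <- 0.9
    reading_matrix[a, a + 1] <- 0.1
}

# Numbers-at-age as read by reader i: sum over true ages
num_as_read <- as.vector(num %*% reading_matrix)
stopifnot(all.equal(sum(num_as_read), sum(num)))  # no fish lost or gained
```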
The additional likelihood function I'm just about understanding, but I think we'll need to go over it on Monday before it truly sinks in.
@bthe I've had a go implementing the housekeeping for this to work, have a look at the example script here: https://github.com/gadget-framework/gadget3/blob/age-reading-error-40/demo-age-reading-error/run.R I'm pretty sure it works :)
In short:

* `g3l_catchdistribution` has an additional `transform_fs` to transform the age dimension. Identical to migration: you give a formula to say how much goes from `age` --> `destage`. In this case, we just look up in a matrix (`reader1matrix` in the example script); see the sketch below.
* `g3l_controlset` compares against control sets.
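To make the first bullet concrete, a hedged sketch of what such a call could look like. `g3l_catchdistribution` and `transform_fs` are from the branch, but the `age`/`destage` iterator names, passing the matrix via `g3_formula`, and all the data/fleet/stock objects are assumptions based only on this thread, not the branch's confirmed API:

```r
library(gadget3)

# Hypothetical reading-error matrix for reader 1:
# reader1matrix[true_age, read_age], rows summing to 1.
reader1matrix <- diag(10)  # start from a "perfect reader"

g3l_catchdistribution(
    'reader1_adist',
    obs_data = adist_reader1,     # ages as read by reader 1, defined elsewhere
    fleets = list(comm_fleet),    # assumed to exist elsewhere in the model
    stocks = list(st),
    # How much of model age `age` is recorded as `destage`: a straight
    # matrix lookup (assuming ages run 1..10, so values index directly)
    transform_fs = list(age = g3_formula(
        reader1matrix[age, destage],
        reader1matrix = reader1matrix)),
    function_f = g3l_distribution_sumofsquares())
```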
> `g3l_controlset`?

As suspected, `g3l_controlset` is nonsense. The `transform_fs` bit is good, but there needs to be an action that runs once at the beginning of the model that:

1. Takes a `mean` and `stddev` for each reader and forms the transform matrix to use in `transform_fs` (see the sketch below). Default ones from the paper, parametrised as they suggest.
2. Adds to `nll` the result of applying the `mean` & `stddev` to the control-set table through (5) in the paper.

`transform_fs` is useful without this, since it may be the wiggle-room needed to e.g. account for variances between growth as observed by tagging data and age measurement.
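A minimal plain-R sketch of what the once-at-start action could compute; the parameter shapes and values are made up for illustration (in the model they would presumably come from per-reader `mean`/`stddev` parameters):

```r
# Discretize a normal reading density into a transform matrix:
# reading_matrix[true_age, read_age], rows summing to 1.
ages <- 1:10
reader_mean <- ages + 0.2            # b_a: slight positive reading bias
reader_stddev <- 0.5 + 0.05 * ages   # s_a: spread grows with age

reading_matrix <- t(vapply(seq_along(ages), function (a) {
    d <- dnorm(ages, mean = reader_mean[a], sd = reader_stddev[a])
    d / sum(d)  # normalize, so every true-age fish is read as *something*
}, numeric(length(ages))))
# vapply stacks results as [read_age, true_age] columns, hence the t()
stopifnot(all.equal(rowSums(reading_matrix), rep(1, length(ages))))
```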
The above merges `transform_fs`.
For modelling the error matrix, consider implementing a TMB version of `outer` to form the matrix.
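In R terms the matrix is one `outer` call plus a row normalisation, which is the piece that would need a TMB-side equivalent:

```r
# Same matrix as the vapply sketch above, via outer():
# evaluate dnorm over the whole (true_age, read_age) grid at once.
ages <- 1:10
reader_mean <- ages + 0.2
reader_stddev <- 0.5 + 0.05 * ages

m <- outer(ages, ages, function (a, aprime)
    dnorm(aprime, mean = reader_mean[a], sd = reader_stddev[a]))
m <- m / rowSums(m)  # row-normalise: m[true_age, read_age] sums to 1 per row
```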
Looking at what would be necessary to incorporate age reading errors in g3, I "stumbled upon" this paper on the subject. The idea is that, given an age `a`, an age reader `i` will perceive the age to be `a'`. So what is actually observed has a distribution `p_i(a'|a)` which depends on the reader. To estimate this distribution function, age readers are often asked to compare their readings using a control set of otoliths/bones/scales etc. The paper suggests that `p_i(a'|a)` could be a normal density with age-varying mean `b_ai` and error `s_ai` of the form:
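(sketching the shape here, since the exact expression is in the paper; the bounded-linear form of $b_{a,i}$ and the role of the reader-specific shape function $f_i$ are assumptions pieced together from the surrounding text)

$$
p_i(a' \mid a) = \frac{1}{\sqrt{2\pi}\, s_{a,i}} \exp\!\left(-\frac{(a' - b_{a,i})^2}{2 s_{a,i}^2}\right),
\qquad
b_{a,i} = c_{L,i} + \left(c_{H,i} - c_{L,i}\right) f_i(a),
$$

with $s_{a,i}$ bounded in the same way by its own constants,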
where `c_Li` and `c_Hi` are the min (L) and max (H) age values for `b_a` and `s_a`, depending on the context. I guess that there are use cases where we would like to change the age reading function and/or the `f_i` function.

To estimate the density function you use the control set (or sets, if more than one) and minimize the following:
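(again a sketch of the likely shape, marginalising each fish's paired readings over its unknown true age; $j$ indexes the control-set fish and $\pi_a$ is the probability of true age $a$)

$$
\mathrm{nll} = -\sum_{j} \log \sum_{a} \pi_a \prod_{i} p_i\!\left(a'_i \mid a\right),
$$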
where `a` is the model age and `a'_i` the observed age from reader `i`.

So to implement this in g3 we would need to do something like:

1. Transform the model's age dimension through the reading distribution (`p_i(a'|a)` above). This would require an additional reader dimension in the likelihood data.
2. Estimate the distribution's parameters (`nll` above) from age readings of a control set that two or more age readers have read, and add this to the model likelihood (sketched in R after this list).

The second step could be optional: either the user could estimate these parameters outside of the model, and/or the data is simply not available, so you may want to test different structures. Potentially tagging data could inform on this.
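A plain-R sketch of that second step, under the same illustrative assumptions as the matrix sketches above (two readers, ages 1..10, flat prior over true age; none of this is from the branch):

```r
ages <- 1:10

# Build a row-normalised reading matrix for a reader with a given
# constant bias and spread: m[true_age, read_age]
reading_matrix_for <- function (bias, spread) {
    m <- outer(ages, ages, function (a, aprime)
        dnorm(aprime, mean = a + bias, sd = spread))
    m / rowSums(m)
}
reading_matrix_1 <- reading_matrix_for(bias = 0.2, spread = 0.6)
reading_matrix_2 <- reading_matrix_for(bias = -0.1, spread = 0.8)

# Paired readings of the same fish by both readers
control_set <- data.frame(
    reader1 = c(3, 4, 4, 6),
    reader2 = c(3, 5, 4, 7))
prior_age <- rep(1 / length(ages), length(ages))  # flat P(a) for the sketch

nll <- 0
for (j in seq_len(nrow(control_set))) {
    # P(fish j's readings) = sum over true age a of
    #   P(a) * p_1(reading_1 | a) * p_2(reading_2 | a)
    lik_j <- sum(prior_age *
        reading_matrix_1[, control_set$reader1[j]] *
        reading_matrix_2[, control_set$reader2[j]])
    nll <- nll - log(lik_j)
}
nll  # the contribution to add to the model likelihood
```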
I guess the same sort of logic would apply to maturity data.