Open sgaichas opened 4 years ago
Sorry if I am missing the mark here, I did not review all of the code to see exactly what is going on, but does this assume that weight-at-age samples are independent of age-composition samples? I cannot think of any situation where you would age fish and weigh them and not use those ages in the model as marginal age compositions.
You are absolutely correct @kellijohnson-NOAA that we wouldn't leave out age samples if we had them. And thank you for helping me think this all the way through. So here is what I think we can do:
Atlantis outputs n at age and weight at age. We use the initial create_survey and sample_fish with an effN equal to the number of lengths measured; the lengths themselves are estimated with calc_age2length because length is not tracked by Atlantis itself. Testing to date has kept all lengths, ages, and mean weight at age output from this step, which I think gives us an unrealistically large age sample, and also a mean weight at age based on this unrealistically large age sample. (This is assuming most surveys measure 10-100x more lengths than ages. If a survey measures as many ages as lengths, then we can stop here.)
So I think we can keep the length output of calc_age2length, but the next step would be to run the n at age output of sample_fish through sample_ages to represent the age subsample (still a subset of the fish originally collected on the survey and measured for length). Then we can re-run calc_age2length with the subsample of ages output from sample_ages (optionally with ageing error included) to get both a length composition of the subsampled aged fish and a mean weight at age based only on the subsampled aged fish.
An extra step, but much more representative of how (at least US) surveys work.
Does this make more sense?
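To make the two-stage idea above concrete, here is a minimal Python sketch. It is not atlantisom code: the toy catch, the function names `make_toy_catch` and `two_stage_sample`, and the length-weight relationship are all assumptions made up for illustration. It only shows the sampling logic, where ages are a subsample of the lengthed fish and mean weight at age comes from the aged fish alone.

```python
import random
from collections import Counter, defaultdict


def make_toy_catch(seed=1):
    """Hypothetical survey catch: one (age, length_cm, weight_g) tuple per fish."""
    rng = random.Random(seed)
    catch = []
    for age in range(1, 6):
        for _ in range(400):
            length = max(1.0, rng.gauss(10.0 * age, 2.0))
            # Assumed cubic length-weight relationship, for illustration only.
            catch.append((age, length, 0.01 * length ** 3))
    return catch


def two_stage_sample(catch, n_lengths, n_ages, seed=2):
    """Measure n_lengths fish for length, then age a subsample of n_ages of them."""
    rng = random.Random(seed)
    lengthed = rng.sample(catch, n_lengths)   # stage 1: fish measured for length
    aged = rng.sample(lengthed, n_ages)       # stage 2: subsample of those fish aged

    length_comp = Counter(int(length) for _, length, _ in lengthed)  # 1 cm bins
    age_comp = Counter(age for age, _, _ in aged)

    # Mean weight at age uses only the aged subsample, not every lengthed fish.
    weights_by_age = defaultdict(list)
    for age, _, weight in aged:
        weights_by_age[age].append(weight)
    mean_waa = {a: sum(ws) / len(ws) for a, ws in sorted(weights_by_age.items())}
    return length_comp, age_comp, mean_waa


catch = make_toy_catch()
length_comp, age_comp, mean_waa = two_stage_sample(catch, n_lengths=1000, n_ages=100)
```

With 1000 lengths and 100 ages this mimics the assumed 10x ratio of lengths to ages; setting `n_ages` equal to `n_lengths` reproduces the case where every lengthed fish is aged.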
In fisheries data, we don't always get both a length and an age for a given fish. Sometimes there will be age data with no length information. So, would you always want to sample ages only from the fish that are lengthed? In ss3sim we allow the sampling to be separate: we sample from the truth twice, (1) for ages and (2) for lengths. The exception is conditional age-at-length data, where we sample for length and then take a total number of ages from the lengthed fish based on their length distribution, i.e., more ages from the most abundant length bin and fewer ages from the bins near the tails. In contrast, many sampling protocols are length stratified and take an equal number of ages per bin if available, but we don't allow for this latter kind of sampling in ss3sim.
If you are sampling ages from those that are lengthed and putting the ages into the model as marginal age-composition samples, your information is not as independent as the model assumes because it is double counting each fish, i.e., assuming a length measurement is from an independent fish from the population and assuming an age measurement is from a new independent sample of the population.
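The difference between the two ways of allocating ages across length bins can be sketched in a few lines of Python. This is an illustration only, not ss3sim or atlantisom code; the function names, the length bins, and the counts are hypothetical.

```python
def allocate_proportional(length_counts, n_ages):
    """Ages per length bin in proportion to the number of lengthed fish in the bin."""
    total = sum(length_counts.values())
    raw = {b: n_ages * c / total for b, c in length_counts.items()}
    alloc = {b: int(r) for b, r in raw.items()}
    # Hand out any ages lost to rounding, largest fractional remainder first.
    leftover = n_ages - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda b: raw[b] - alloc[b], reverse=True)
    for b in by_remainder[:leftover]:
        alloc[b] += 1
    return alloc


def allocate_equal(length_counts, per_bin):
    """Length-stratified protocol: the same number of ages per bin, if available."""
    return {b: min(per_bin, c) for b, c in length_counts.items()}


# Hypothetical counts of lengthed fish by length bin (lower bin edge in cm).
length_counts = {10: 5, 20: 60, 30: 30, 40: 5}

proportional = allocate_proportional(length_counts, n_ages=20)
stratified = allocate_equal(length_counts, per_bin=5)
```

Here `proportional` is `{10: 1, 20: 12, 30: 6, 40: 1}` and `stratified` is `{10: 5, 20: 5, 30: 5, 40: 5}`. Both use 20 ages, but the stratified scheme over-represents the tails relative to the length distribution, which is why treating those ages as a marginal age composition would be misleading.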
Sorry if this is a bit in the weeds.
Not at all, I'd like to design this so users have options for different biological sampling methods and this is definitely helping.
We can make the age sample independent of the fish sampled for length, similarly to ss3sim, if we re-do sampling at the sample_fish stage with an effN that reflects the age sample. We can then run sample_ages on this to add ageing error if necessary. We can get mean weight at age for this sample by running calc_age2length and ignoring or discarding the length output if it isn't wanted. That is a lot of overhead, so I should write a simpler function to calculate mean weight at age only, for use when lengths aren't needed (extracting that bit from calc_age2length would probably work).
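As a rough illustration of the independent-sampling option, the Python sketch below draws the length sample and the age sample separately from the same population and computes mean weight at age from the age sample alone, with no length step involved. The function names and the toy population are assumptions for illustration; this is not how calc_age2length is implemented.

```python
import random
from collections import defaultdict


def mean_weight_at_age(aged_fish):
    """Mean weight per age class from (age, weight) pairs; no lengths needed."""
    totals = defaultdict(lambda: [0.0, 0])
    for age, weight in aged_fish:
        totals[age][0] += weight
        totals[age][1] += 1
    return {age: total / n for age, (total, n) in sorted(totals.items())}


def independent_samples(population, n_lengths, n_ages, seed=0):
    """Draw the length sample and the age sample separately from the population."""
    rng = random.Random(seed)
    length_sample = [length for _, length, _ in rng.sample(population, n_lengths)]
    age_sample = [(age, weight) for age, _, weight in rng.sample(population, n_ages)]
    return length_sample, age_sample


# Hypothetical population of (age, length_cm, weight_g) records.
population = [
    (age, 10.0 * age + i % 5, 100.0 * age + i)
    for age in range(1, 5)
    for i in range(50)
]

length_sample, age_sample = independent_samples(population, n_lengths=100, n_ages=20)
waa = mean_weight_at_age(age_sample)
```

Because the two draws are separate, a fish can contribute to the length composition, the age composition, both, or neither, which is the independence the assessment model assumes for marginal compositions.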
I would also rather avoid an option that mimics length-stratified sampling for age, so I'm glad to hear ss3sim doesn't allow it. I think we are treating the survey ages as conditional age at length in the CC Atlantis-based sardine assessment, but @cstawitz can confirm.
Yup, we are using CAAL in the Atlantis-sardine assessment, where every lengthed fish is aged.
In the species I have looked at in Alaska, the effect of length-stratified sampling has been pretty minimal because the length bins they use are often very small. But it gets worse the larger the bins are. I have code that corrects for length-stratified sampling if that's of interest (though it sounds like we're not trying to replicate it in atlantisom).
The way sampling is set up, effN for each survey is the number of fish measured for length, age, and average weight at age. We could introduce further realism by using the sample_ages() function to take a subsample for age composition and optionally apply an ageing error matrix (specified in the survey and fishery config files). The age comp based on a larger sample size and without error is still used to generate the length composition as input to calc_age2length, so we would have to run sample_ages after we generate lengths and weight at age. This would require changing the om_comps() wrapper. Saved age comp objects remain the same. This still means weight at age is from an unrealistically large age sample; we could fix that later.
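For readers unfamiliar with ageing error matrices, the Python sketch below shows the expected-value version of the calculation: each true age class is spread across observed age classes according to a misreading probability. The matrix values and the function name are hypothetical, and sample_ages() may apply the error differently (for example by reassigning individual fish at random).

```python
def apply_ageing_error(age_comp, error_matrix):
    """Expected observed numbers at age.

    age_comp[i] is the number of fish of true age class i.
    error_matrix[i][j] is the assumed probability that a fish of true age
    class i is read as age class j; each row should sum to 1.
    """
    n_classes = len(age_comp)
    observed = [0.0] * n_classes
    for i, count in enumerate(age_comp):
        for j in range(n_classes):
            observed[j] += count * error_matrix[i][j]
    return observed


# Hypothetical subsampled age composition and ageing error matrix.
true_comp = [100, 50, 10]
error_matrix = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]

observed_comp = apply_ageing_error(true_comp, error_matrix)
```

With these made-up values the observed composition is approximately 95, 52, and 13 fish in the three age classes. The total is unchanged because each row of the matrix sums to 1.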