I think this is a great idea Ian, good thinking... and perhaps this can be compared with the base model in one of our sensitivity groups (for plotting) and leave out the max age selectivity sensitivities as they were in the document last year (unless they show something drastically different, but doubtful)
Great idea, I set up a new sensitivity group for it and made a couple of plots. See commit c63b1fe0ff6c889ea0544aee4d73ead835842077
I remembered that Thorson's function for the Laplace Approximation described in the manuscript mentioned above is built into r4ss as "NegLogInt_Fn". The example function call in the help for this function appears to be unchanged from applying it to a hake model from the past, so I was able to update it to reflect the parameter numbers associated with our current range of years with time-varying selectivity and run it again.
I made a copy of the model 33 folder and used the following commands:
direc <- "C:/github/hake-assessment/models/33B_Sen_Laplace_test/"
if("Optimization_record.txt" %in% list.files(direc)) {
file.remove(file.path(direc,"Optimization_record.txt"))
}
Opt = optimize(f=NegLogInt_Fn,
interval=c(0.002, 0.15),
maximum=FALSE,
File=direc,
CTL_linenum_List=list(127:131),
ESTPAR_num_List=list(88:217),
Int_Group_List=1,
PAR_num_Vec=NA,
Version=1, Intern=TRUE)
This produced the following output
> Opt
$minimum
[1] 0.0291631779252872
$objective
[1] 384.890797617605
which I believe suggests that the 0.03 value would be revised downward to 0.029.
However, I believe that this is a case where the pure statistical theory may not match well with the need for a plausible model so we should explore alternative criteria for setting these variance parameters.
Note: to examine the variability of the estimated annual deviations in each selectivity parameter, if the model object is named "model", these quantities have parameter labels like "AgeSel_1P_3_Fishery_DEVadd_1991" (where the first 1 designates the fleet number and the 3 designates the parameter number). The estimated values start with parameter 3 because parameter 1 sets selectivity at age 0 to 0.0 and parameter 2 represents the change in selectivity from age 0 to age 1, which is fixed as a reference since the maximum selectivity is scaled to 1, resulting in one fewer degree of freedom than the number of ages estimated.
Anyway, the standard deviation of the devs for each estimated parameter for the fishery can be found as
sd(model$parameters$Value[grep("AgeSel_1P_3_Fishery_DEV", model$parameters$Label)])
sd(model$parameters$Value[grep("AgeSel_1P_4_Fishery_DEV", model$parameters$Label)])
sd(model$parameters$Value[grep("AgeSel_1P_5_Fishery_DEV", model$parameters$Label)])
sd(model$parameters$Value[grep("AgeSel_1P_6_Fishery_DEV", model$parameters$Label)])
sd(model$parameters$Value[grep("AgeSel_1P_7_Fishery_DEV", model$parameters$Label)])
or we could generalize that by building the parameter label string in a loop.
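A minimal sketch of that loop, assuming the model output has been read into an object named "model" (e.g., with r4ss::SS_output):
# loop over fishery selectivity parameters 3-7 and report the SD of their devs
for (p in 3:7) {
  lab <- paste0("AgeSel_1P_", p, "_Fishery_DEV")
  devs <- model$parameters$Value[grep(lab, model$parameters$Label)]
  cat(lab, "SD of devs:", round(sd(devs), 4), "\n")
}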
Thanks for this Ian, this would've taken me a long time to figure out!
Here's a potential support for a choice of time-varying selectivity flexibility SD of 0.20.
I looked at old assessment docs. In 2013, we had fixed selectivity in the base model and did a sensitivity looking at time-varying selectivity with 0.05 and 0.20 values for this SD quantity we're trying to set.
In 2014, time-varying selectivity with an SD value of 0.20 was included in the Operating Model of the MSE, and we explored values of 0.05 and 0.20 in the Estimation Model. Table A.5 (page 120) of the 2014 assessment document shows a big difference between using no time-varying selectivity in the assessment and using either value, but not much difference between 0.05 and 0.20, especially in the metrics related to depletion. However, in the metrics related to catch, increasing the SD value from 0.05 to 0.20 in the assessment reduces the probability of zero catch (due to the stock being estimated as below 10%) from 3% to 1%.
Those results are dependent on the assumption that the variability in the OM is 0.20, and it would have been good to have a case where the OM was 0.05 and the EM was 0.20 (that is, overestimating the degree of time-varying selectivity), but we didn't seem to run that scenario.
In 2014, Thorson had also developed this Laplace Approximation approach to estimating the variability of this quantity, so the base model ended up with SD=0.03. See last 2 paragraphs of page iii in 2014 assessment (7th page of PDF) for discussion on this.
Nevertheless, in addition to Andy noting that results stabilize around 0.2, we could justify the choice of 0.2 as a value that was tested in an MSE context and performed as well as or better than the smaller value (under the assumption of a more-variable OM).
Here are the likelihoods (the 'TOTAL' row in model$likelihoods_used) for some of the sensitivity runs for the standard error of selectivity (extracted manually for now; a sketch for pulling these programmatically follows below):
std error likelihood
0.03
0.05 162
0.07
0.09 154.9
0.15
0.20 146.6
0.25 144.6
0.30 143.0
As Ian thought on the phone, there is no optimum in this range. (I assume these are negative log-likelihoods, so 143 is the 'most likely'.) [Writing this before reading Ian's latest comment]
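For reference, here is a minimal sketch of how those TOTAL values could be pulled programmatically, assuming the sensitivity runs have been read with r4ss::SS_output into a named list (the list name "sens_models" and its construction are hypothetical):
# extract the TOTAL likelihood from each element of a named list of SS_output results
totals <- sapply(sens_models, function(m) m$likelihoods_used["TOTAL", "values"])
print(round(totals, 1))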
Ian's comments sound good to me. It seems like 0.2 and other values behaved similarly before, and we find that the historical results don't change when the value is changed, so that seems consistent. [though is that because only the final few years are likely to show a change in estimated biomass etc.? - I can't quite get my head around it]. I have to leave in a minute but will look up the references later to try and understand it better, though happy to defer to Ian's expertise and rationale, as he understands this better.
Hi everyone, I'm exposing myself and the fact that I occasionally peek at the email notifications through Github. I figured that it is OK since this is all on public Github.
I apologize if I'm off-base since I'm not thinking about hake but about the IPHC annual meeting next week, but I don't think that looking at the likelihood alone is useful for a random effect parameter. According to Thorson et al. (2014), the log likelihood to maximize for a random effect is the sum of two components: the log penalized likelihood and half of the negative log determinant of the Hessian matrix (equation 6 and the preceding paragraph). This is what was used to determine the value of 0.03. I can't remember the explanation of why the two components pull against each other (Thorson may), but I do remember plotting the two components separately. It may be in a JTC presentation.
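Written out (in my own notation, not a transcription of equation 6 in Thorson et al. 2014), those two components form the Laplace approximation to the marginal log likelihood for the deviation SD phi:

\log L(\phi) \approx \log L_{\mathrm{pen}}\big(\hat{\varepsilon}(\phi), \phi\big) - \tfrac{1}{2}\log\det \mathbf{H}(\phi) + \text{constant}

where \hat{\varepsilon}(\phi) are the penalized estimates of the deviations given \phi, and \mathbf{H}(\phi) is the Hessian of the negative log penalized likelihood with respect to the deviations, evaluated at \hat{\varepsilon}(\phi). Profiling over the TOTAL likelihood alone uses only the first term, which is one way of seeing why it keeps improving as the SD increases.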
In addition, after we did the MSE work, we concluded that introducing even a small amount of variability resulted in significant improvement, and increasing that variation resulted in minimal gains.
Anyways, my Friday thoughts and feel free to ignore. Good luck on the document! Allan
PS: I have not looked at any of the document, just some of the emails that come through.
Andy, thanks for the likelihoods.
Allan, thanks for peeking. That's the benefit of a public repository. The issue is that the model with the small amount of time-varying selectivity estimates a recruitment deviation of 3.3 for 2014, which results in an MLE estimate of 2014 recruitment that is three times larger than the 2010 estimate and does not seem plausible to us.
I don't really understand the theory behind Thorson's approach, but yes, the monotonic improvement in likelihood is not accounting for the "cost" of more flexibility. Thorson's method attempts to estimate that cost to find a good balance, but there's no guarantee that it will provide a better assessment model.
I'll share some further thoughts in a subsequent comment once I gather them.
We knew you couldn't stay away....!
Yeah, Ian mentioned what you explained during our call today. We (I) don't fully understand the pull, as you say, but do understand the principle of random effects and likelihoods. We decided not to use the likelihood as a determinant; it was just one thing to look at to see how it changed as the SD was adjusted upwards. We'd like to explore this further, perhaps revisiting your and JT's analysis during the next off-season.
I may be able to help re-run Thorson's analysis to re-estimate the variance. Does the estimated deviate in 2014 decline to reasonable values?
I think that calculating Thorson's likelihood would be simple in the profile that you have already done, but I've already interfered enough.
Good luck, allan
Allan, we thought the same, but managed to repeat it already and it produced almost identical results. See email in this thread from 4 hours ago.
Here are two more things that I realized while looking at the model.
First, looking at the vector of deviations over time shows that the largest deviations are estimated for 2015 and 2016, so my suggestion to look at the standard deviation of the estimated deviations may not make sense.
That is, instead of computing the SD of the vector using
sd(model$parameters$Value[grep("AgeSel_1P_3_Fishery_DEV", model$parameters$Label)])
you could save and plot the vector of values
vec_P3 <- model$parameters$Value[grep("AgeSel_1P_3_Fishery_DEV", model$parameters$Label)]
vec_P4 <- model$parameters$Value[grep("AgeSel_1P_4_Fishery_DEV", model$parameters$Label)]
vec_P5 <- model$parameters$Value[grep("AgeSel_1P_5_Fishery_DEV", model$parameters$Label)]
vec_P6 <- model$parameters$Value[grep("AgeSel_1P_6_Fishery_DEV", model$parameters$Label)]
vec_P7 <- model$parameters$Value[grep("AgeSel_1P_7_Fishery_DEV", model$parameters$Label)]
plot(0, type='n', xlim=c(1991, 2016), ylim=c(-0.3, 0.3), xlab="year", ylab="deviation")
lines(1991:2016, vec_P3, col=1)
lines(1991:2016, vec_P4, col=2)
lines(1991:2016, vec_P5, col=3)
lines(1991:2016, vec_P6, col=4)
lines(1991:2016, vec_P7, col=5)
Just remembered that I can just drag images, so will do that here.
That suggests that tuning the flexibility based on the estimated values for all years might not be right if the most recent years really did have more unusual ocean conditions and, potentially, selectivity that differs more from the past.
Second, one reason why the change in time-varying selectivity impacts the 2014 recruitment so much, but not the earlier part of the time series, is that the "End year standard recruitment devs" defined on line 66 of the control file is set to 2013 in the models we've been using. Since there is information about the 2014 recruitment in the data, I think we should change this value to 2014.
I've uploaded a model 40 which has this change implemented. The time-series is almost identical but the equilibrium recruitment shows a change of about 10% because the large 2014 recruitment is now included in the set of values that are more directly connected to the equilibrium around which the recruitments vary.
For more info on the setup in Stock Synthesis which allows a separate vector of "early", "main", and "late"+"forecast" recdevs see the SS 3.24 User Manual "Appendix A: Recruitment Variability and Bias Corrections" starting on page 125.
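As a quick, hedged check of which vector the 2014 deviation lands in (m33 and m40 are hypothetical names for the two models read with r4ss::SS_output; label patterns follow SS 3.24 conventions):
# expect something like "Late_RecrDev_2014" in model 33 and "Main_RecrDev_2014" in model 40
m33$parameters[grep("RecrDev_2014", m33$parameters$Label), c("Label", "Value")]
m40$parameters[grep("RecrDev_2014", m40$parameters$Label), c("Label", "Value")]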
In conclusion, my suggestions based on all this thinking are:
Sounds good to me - model run 20 is a sensitivity to including 2014 in the standard devs time series, which may be the same as model 40 unless model 40 also uses SD=0.20.
Just downloaded model 40 and it is strictly a sensitivity to 2014 (SD=0.03 for tv-select), so this will be the same as model 20. But I still agree that there is enough info now to include 2014 in the main period.
Not to make things too complicated, but there is a subtle difference between models 20 and 40. Model 20 looked at the sensitivity of changing the range of years with the maximum bias adjustment from 2013 to 2014 (line 76 of control file), while model 40 changed the end of the vector of "main" recdevs from 2013 to 2014 (line 66 of control file).
The common practice with the bias adjustment is to have the period with max adjustment cover those years with the most information about recruitment and taper off at the end as the uncertainty increases. The 2014 recdev is definitely more uncertain than 2013 and before, so I think keeping the max bias adjustment on line 76 at 2013 makes sense.
However, the fact that there is definitely SOME information about the 2014 recruitment, even if it's really uncertain, suggests that it should be included in the main vector of deviations, not in a separate vector that extends into the forecast.
I forgot to mention that the bias adjustment issues go away when you do MCMC because the MCMC explores the uncertainty within each recdev as well as variability among them, so an MCMC for models 19 and 20 should produce identical results.
But including 2014 in the main vector of recdevs (which is constrained to sum to 0) will have some impact on both MLE and MCMC and should cause the equilibrium to better reflect the "average" recruitment, including the uncertain estimate for 2014.
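A tiny check of that constraint (hedged sketch, assuming the model has been read with r4ss::SS_output and SS 3.24-style parameter labels):
# the main-period recruitment deviations should sum to approximately zero
main_devs <- model$parameters$Value[grep("Main_RecrDev", model$parameters$Label)]
sum(main_devs)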
So my suggestion (which I'm open to moving away from) would be to take model 40, change the SD of the time-varying selectivity parameters from 0.03 to 0.20 (or some other value), and consider that as a potential new base model and run the MCMCs on it.
Need a clarification (for when we write this up): in the 2016 assessment (p39; copied from 2015), phi=0.03 is defined as the standard deviation for the penalty function. In 2014, page iii says phi is the 'random effects variance' [not standard deviation], although phi^2 is used on p144. So I presume phi is the standard deviation (just a typo on page iii), yes?
Probably worth having the half-page of 2014 Appendix C text in this year's assessment if we're going to change the value from 0.03, rather than just refer to 2014. I can do that next week.
Agree with Ian's latest comments. Don't fully understand it all but seems logical. So, modifying model 40 with 0.20, and use that as base run (presumably will look similar enough to the sensitivity we did earlier).
I think the use of the term "variance" on page iii of the 2014 assessment is a typo.
I looked in the TPL file for SS version 3.24S and it has the lines
for(j=MGparm_dev_minyr(i);j<=MGparm_dev_maxyr(i);j++)
{parm_dev_like += 0.5*square( MGparm_dev(i,j) / MGparm_dev_stddev(i) );}
where MGparm_dev_stddev is the value from the 12th column of the parameter line. If we assume that quantity is phi, then this matches the equation on page 144 (170th page of the PDF) in the 2014 assessment, and we should just continue to call it a "standard deviation for the penalty function" and not use the term "variance" unless we square it.
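In equation form (my notation), that loop accumulates

\text{parm\_dev\_like} = \sum_{y=y_{\min}}^{y_{\max}} \frac{1}{2}\left(\frac{\varepsilon_y}{\phi}\right)^{2} = \sum_{y} \frac{\varepsilon_y^{2}}{2\phi^{2}},

with \varepsilon_y = MGparm_dev(i, y) and \phi = MGparm_dev_stddev(i), so phi enters the penalty as a standard deviation, consistent with the phi^2 form on page 144 of the 2014 assessment.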
Sounds good. I'm watching soccer while running all.r and slowly trying to make progress on a few of the issues that are assigned to me.
Oops, my previous reply was supposed to be for issue #195. I think I'll close this issue to clean things up as it seems we've decided to go forward with the more flexible time-varying selectivity unless something in model 41 looks unexpectedly problematic.
As noted in my comment on issue #175, the very large estimated size of the 2014 cohort is dependent on the fishery catching lots of age-2 fish in 2016 (and an above average amount of age-1s in 2015). However, changes in selectivity (or availability if the age-6 fish were more in Canada toward the end of the 2016 fishing season) could contribute to the observed proportions of these cohorts in the catch.
I've uploaded a model 33 to the Google Drive that has the standard error of the deviations on the selectivity parameters increased from 0.03 to 0.3. The original 0.03 value was based on an analysis that Jim Thorson and Allan Hicks conducted a few years ago (described in an open access paper here: https://academic.oup.com/icesjms/article/72/1/178/814992/Random-effect-estimation-of-time-varying-factors). However, they made some assumptions, including that the same standard error parameter was applied to control the flexibility at each age, when in fact the appropriate flexibility may be greater for some ages than others.
Ultimately, a model with greater flexibility in selectivity may simply be more plausible than one with an enormous 2014 recruitment.