Sex structure - Githubissues

ebuhle / LCRchumIPM

This is the development site for an Integrated Population Model for chum salmon in the lower Columbia River.

MIT License


Sex structure #8

Closed ebuhle closed 2 years ago

ebuhle commented 3 years ago

This exchange in #6 got me thinking, it really wouldn't be that difficult to include sex structure in a specific, narrow sense. To wit, simply use the observed sex-frequency data c(n_M_obs, n_F_obs) to estimate the proportion female p_F independently for each year in a manner directly analogous to p_HOS and c(n_W_obs, n_H_obs). Use p_F to calculate expected egg production E_hat (replacing the fixed assumed value of 0.5): E_hat = f * p_F * S, where f is age-weighted fecundity as before.
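As a rough illustration of the calculation described above, here is a minimal Python sketch. It is not the model code: the closed-form Beta posterior mean stands in for the fitted estimate of p_F, and the fecundity, spawner and sample counts are made-up numbers chosen only to show how E_hat shifts when p_F replaces the fixed 0.5.

```python
def expected_eggs(f, p_F, S):
    """Expected egg production, E_hat = f * p_F * S."""
    return f * p_F * S


def p_female_posterior_mean(n_M_obs, n_F_obs, a=1.0, b=1.0):
    """Posterior mean of p_F under a Beta(a, b) prior and binomial sampling.

    Beta(1, 1) is the Unif(0, 1) prior; the conjugate closed form is an
    assumption used here for illustration only.
    """
    return (a + n_F_obs) / (a + b + n_M_obs + n_F_obs)


# Hypothetical inputs (not from the data set)
f = 2500.0                 # age-weighted fecundity, eggs per female
S = 1000.0                 # spawner abundance
n_M_obs, n_F_obs = 60, 40  # observed sex-frequency counts

p_F = p_female_posterior_mean(n_M_obs, n_F_obs)
E_hat_sex = expected_eggs(f, p_F, S)    # uses estimated proportion female
E_hat_fixed = expected_eggs(f, 0.5, S)  # previous fixed 50:50 assumption
```

With a male-biased sample like this one, E_hat_sex comes out lower than E_hat_fixed, which is the direction of the adjustment one would expect.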

So I went ahead and tried it. As is the case with p_HOS, each element of p_F has an independent Unif(0,1) prior, although that could be tightened slightly around 0.5 to better handle cases where n_M_obs + n_F_obs == 0 (these are mostly in the terminal year 2019, where bio_data are not yet available, although there are a few interior cases as well). In order to constrain p_F when forecasting, we would have to include future sex-composition "data" -- say, c(1000, 1000) for a tightly constrained 50:50 sex ratio. (The same kludge would work to constrain future p_HOS at nonzero values using H/W-composition "data".)
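To see why the pseudo-data kludge pins p_F down, it may help to compare the spread of the implied Beta distributions. This is a sketch under the assumption of a conjugate Beta-binomial update, which is a simplification of the fitted model; the function name is hypothetical.

```python
import math


def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


# Year with no sex-composition samples: the posterior equals the Unif(0,1) prior.
sd_no_data = beta_sd(1, 1)

# Same year with pseudo-data c(1000, 1000) added to the Unif(0,1) prior.
sd_pseudo = beta_sd(1 + 1000, 1 + 1000)
```

Both distributions are centered on 0.5, but the pseudo-data version has an SD of roughly 0.011 compared with about 0.29 for the flat prior, so forecasts would effectively use a 50:50 sex ratio.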

The estimates of p_F are faithful to the data, but the resulting estimates of mu_psi, psi and apparent egg-to-smolt survival are, surprisingly, barely different from the previous version. In fact, in some cases apparent survival in the sex-structured model is actually a bit higher (e.g., compare here). Computational performance is about as bad as before.

So I guess the question is: should we keep this feature just to say we considered sex ratio and preempt a potential criticism, to make use of some rarely available data, and (possibly) to provide a starting point for sex-structured parameters / states in the future if desired? I don't see much of a downside apart from the slight increase in cognitive overhead. Maybe the diffuse prior instead of a fixed value of 0.5 is a bad thing in low-information years? (Although, again, it doesn't seem to matter much in practice.)

kalebentley commented 3 years ago

Sweet. I did a quick side-by-side "eyeball" evaluation of the previous estimates of FW survival you posted here versus the bottom plot above and the only population that appears to have changed whatsoever is Hamilton_Channel. This is obviously not surprising given the sex ratio plot above and Todd's explanation of how the Hamilton_Channel has been monitored for adults especially in the more recent years (see here).

I'm not entirely sure what you mean by "cognitive overhead" but if there is no real penalty to keeping sex structure in the model then I would say keep it.

ebuhle commented 3 years ago

Well, it was a slippery slope from "just do the minimal version" to "might as well do it right".

As described in the OP, the initial version of sex structure simply estimated the proportion of females independently by return year, analogous to the parameterization of p_HOS. That approach works OK but, as shown above, is too noisy when few or zero sex-composition samples are available in a given year. It also makes more sense biologically to think of the sex ratio as determined at the cohort level. Like SAR and conditional age-at-return, it is a complex function of the sex-specific annual marine survival and maturation schedule (at least that's what Tom Quinn's book says; there doesn't seem to be much known about it) that we can simplify by modeling it conditional on return.

The updated approach defines p_F by outmigration cohort, analogous to conditional age-at-return p. Like p, p_F has a two-stage hierarchical model with ESU-level mean mu_F, among-pop logit-scale SD sigma_pop_F, and within-pop annual SD sigma_F. The initial 1:max_ocean_age values of q_F (proportion female by return year, used to weight per capita fecundity) are given a beta(3,3) prior to mildly regularize toward 0.5, but just as with spawner age structure q, the initial values are only applied to the orphan cohorts, while modeled values are used for any age classes generated from previous outmigration years. The assumption remains that adult sex ratio is independent of age and origin (in the multi-way contingency table sense), but now we'll be able to examine correlations among the respective link-scale random effects to see whether that assumption is justified.
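The two-stage structure can be sketched as a forward simulation. This is only an illustration of the hierarchy as described in words above: the additive logit-scale form, the function names and the parameter values are assumptions, and the actual model code may parameterize things differently.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_p_F(mu_F, sigma_pop_F, sigma_F, n_pop, n_year, seed=1):
    """Simulate cohort-level proportion female for each population and year.

    Assumed form: logit(p_F[j][t]) = logit(mu_F) + eta_pop[j] + eta_year[j][t],
    with eta_pop ~ N(0, sigma_pop_F) and eta_year ~ N(0, sigma_F), both IID.
    """
    rng = random.Random(seed)
    p_F = []
    for _ in range(n_pop):
        eta_pop = rng.gauss(0.0, sigma_pop_F)  # among-population deviation
        row = [
            inv_logit(logit(mu_F) + eta_pop + rng.gauss(0.0, sigma_F))
            for _ in range(n_year)             # within-population annual deviation
        ]
        p_F.append(row)
    return p_F


# Hypothetical hyperparameter values, for illustration only
p_F = simulate_p_F(mu_F=0.5, sigma_pop_F=0.2, sigma_F=0.3, n_pop=3, n_year=5)
```

Because the annual deviations are drawn around a shared population-level mean, years with few or no samples are informed by the other years and populations rather than by the prior alone.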

Between this and the catastrophically bad 2020 outmigration (?!), egg-to-smolt survival ironically is looking a little more reasonable. Hopefully bringing in habitat area offsets (note the wonky Mmax posteriors) and partially spawned females will further help with psi and tame the remaining divergences.

kalebentley commented 3 years ago

Right on re: updated sex-structure approach.

Although it is off-topic from this thread, I did want to quickly respond to your comment regarding "the catastrophically bad 2020 outmigration" (which I think corresponds to the plunge in the "smolt recruitment anomaly" in plot E above).

As you observed, the estimated chum fry outmigrations in the spring of 2020 were record lows for all four of our monitoring locations - Hamilton_Channel, Duncan_Channel, Grays_MS, and Grays_CJ. Not surprisingly, the spawner abundances were pretty low in the fall of 2019 (and insanely low for CJ).

One of the contributing factors for low spawner abundance was extremely low water during the fall of 2019. Specifically, at Hamilton_Channel and Grays_CJ, spawners couldn't get into the spawning areas (easily) and almost the entire South Channel in Duncan was dry (hence why the South Channel wasn't even monitored for juveniles in 2020). @Hillsont and @BradGarnerWDFW could provide more context if needed. Since the outmigration was so low in Grays_CJ, it makes some sense that the outmigration at Grays_MS was low as well even though the spawner abundance in Grays_MS and Grays_WF were closer to "average".

As an aside, I suppose it is possible (likely?) that some of the spawners originally destined for CJ spawned in WF and MS. Not sure if this will have any ramifications when we start trying to model straying.

Anyhow, I mention all of this both as an FYI but also to highlight that again our juvenile time series may not be fully reflective of what's going on at other locations. It's unclear to me whether the FW survival "anomalies" at our juvenile monitoring sites would affect the estimates at our non-monitored sites (aside from contributing to the hierarchically estimated parameters)...

ebuhle commented 3 years ago

@kalebentley's comment on the factors involved in the record-low 2020 outmigration, along with some detailed exploration of the wonky Mmax posteriors shown above and the correlated patterns in psi, have further convinced me that not accounting for habitat area (in both a spatially and temporally varying sense) is likely contributing to these pathologies we've been seeing in the FW productivity parameters. I was going to write a long comment about it, but I'll hold off since it sounds like we're on the cusp of getting those area estimates.

Anyhow, I mention all of this both as an FYI but also to highlight that again our juvenile time series may not be fully reflective of what's going on at other locations. It's unclear to me whether the FW survival "anomalies" at our juvenile monitoring sites would affect the estimates at our non-monitored sites (aside from contributing to the hierarchically estimated parameters)...

Indeed they would and do, through exactly that mechanism. You can see this in panel E above, where the pop-specific spawner-to-smolt process errors all take a nosedive in brood year 2019 along with the hyper-mean. That's basically baked into the study design and the use of a small subset of smolt monitoring sites to represent the ESU, regardless of analysis method. Short of expanding smolt trapping, the only obvious way to decouple the trends at different sites is to model the FW params as functions of spatiotemporally varying covariates. That said, it sounds like the low flow in 2019 would likely have impacts throughout the ESU.

As an aside, I suppose it is possible (likely?) that some of the spawners originally destined for CJ spawned in WF and MS. Not sure if this will have any ramifications when we start trying to model straying.

Yeah, that's a tough one. I need to open an issue on straying / hatcheries, but we can anticipate that it's going to be challenging to estimate dispersal (esp. time-varying) within a watershed where none of the adults can be assigned to a subpopulation of origin.

Hillsont commented 3 years ago

I would say it is highly likely, or even certain, that because of low water through most of Nov 2019, adults that wanted to spawn in CJ spawned elsewhere in the basin. It's too bad we don't use PBT for origin determination in the Coast strata, which would give us some clues about spawning site fidelity within the Grays basin. Over the last couple of years, we've used "all adults" instead of just brood or channel adults in the adult pool for the PBT analysis in the lower Gorge strata, and it's evident that for mainstem Columbia spawning sites and near-Bonneville tribs., spawning site fidelity is something less than 100% even in "normal" water years.

The 2019 Grays return took a double hit. Low water in most of November prevented access to what we believe is the more productive spawning area in CJ. Lower than normal water levels in the WF and MS would have forced spawning to take place more in the center portions of the gravel bed, making those redds more susceptible to winter flood events (scour and siltation). There were several high flow events over the winter of 2019-20, with at least three of them reaching major flooding heights (stage >16').

The two monitored channels (Duncan and Hamilton Springs) also experienced high flow events over the winter, but nothing in the range that makes me think FW survival would be impacted. I don't have the BY 2019 spawner and outmigrant estimates in hand, so it's hard to say how far outside of expected they were.

tbuehrens commented 3 years ago

just tuning in for a sec...this is looking better. RE sex, Eric, I have modeled sex ratio as a logitnormal random walk or mean reverting process before...this would negate the need for the kludge in forecasting and result in partial pooling of data across years. I too agree that the lack of habitat offsets is a problem that is almost certainly affecting egg to fry productivity estimates...lack of offset will tend to pull down capacity for big pops (which are not spawning channels), and pull up capacity for spawning channels...this pulling on capacity will have the inverse effect on productivity--pulling channels down and mainstem pops up, which is also occurring due to the random effect prior on pop specific egg to fry productivity. once we have habitat offsets for capacity, it should look even better.

ebuhle commented 3 years ago

RE sex, Eric, I have modeled sex ratio as a logitnormal random walk or mean reverting process before...this would negate the need for the kludge in forecasting and result in partial pooling of data across years.

Right, the kludge mentioned in the OP is no longer necessary and partial pooling helped make the estimates less noisy. The only difference between this approach and what you're describing is that annual p_F is IID. There's no real evidence of autocorrelation; a little bit in Hamilton Channel, but even there it's not "significant" at any lag.
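For anyone who wants to check the IID assumption themselves, a sample autocorrelation on the logit-scale annual values is one simple diagnostic. The sketch below is generic and is not necessarily how the check in this thread was done; the input series is made up, and the +/- 1.96 / sqrt(n) band is only the usual rough white-noise reference.

```python
import math


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at a given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def white_noise_bound(n):
    """Approximate 95% band for the autocorrelation of an IID series."""
    return 1.96 / math.sqrt(n)


# Hypothetical logit-scale p_F deviations for one population (not real estimates)
logit_p_F = [0.10, -0.05, 0.20, 0.00, -0.15, 0.05, 0.12, -0.08, 0.03, -0.02]

bound = white_noise_bound(len(logit_p_F))
# True at a given lag means the autocorrelation falls outside the rough band
outside_band = {lag: abs(autocorr(logit_p_F, lag)) > bound for lag in (1, 2, 3)}
```

If clear autocorrelation did show up, a random walk or mean-reverting process on the logit scale would be the natural alternative to IID annual effects.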

One striking feature of the data is that Grays WF is consistently > 50% female. What do you guys think accounts for this?

lack of offset will tend to pull down capacity for big pops (which are not spawning channels), and pull up capacity for spawning channels...this pulling on capacity will have the inverse effect on productivity--pulling channels down and mainstem pops up

Bingo, this is exactly what appears to be happening. The hyper-mean mu_Mmax is driven by the most informative (in terms of length and precision) smolt time series, namely Hamilton Channel (reinforced to some extent by Grays CJ). Since those populations are roughly an order of magnitude larger than Duncan Channel, the latter gets dragged up, which drives its psi estimate down. You can see this in the almost bimodal density here, where the points are the bivariate posterior for each pop and the contours are the bivariate hyper-means. There's also some full-on bimodality in Multnomah, which doesn't show up clearly here but is visible in panel C of the figure above.

This will also do weird things to population-level average SAR, especially in pops that don't have local smolt data to resolve FW vs. marine survival. Anyway, we're off-topic, but I agree habitat offsets ought to help.

tbuehrens commented 3 years ago

Agreed on all points Eric! I think we are getting somewhere! SAR can be constrained spatially!