ebuhle / LCRchumIPM

This is the development site for an Integrated Population Model for chum salmon in the lower Columbia River.
MIT License

freshwater covariates and juvenile production #17

Open kalebentley opened 1 year ago

kalebentley commented 1 year ago

Hey guys,

Over this past week, I’ve been working on generating juvenile abundance estimates for spring ’22 and trying to contextualize the estimate(s) for chum fry at the mainstem Grays River trap. Below is a brain dump of things I've been mulling over. Figured it was worth posting to get feedback and could potentially serve as a springboard for our efforts to better understand and characterize covariates that could be affecting freshwater survival.

OK - so without getting into the details of the juvenile estimates themselves, I’ve generated two separate preliminary estimates of abundance for Grays Basin chum fry in ’22 (i.e., chum passing the mainstem Grays trap). The first estimate (run 1) I generated is ~1M (CV = 10%) and the second (run 2) is ~1.5M (CV = 39%). Obviously, there’s a large relative difference here. But again, ignoring the details of how I derived these two estimates, if we “zoom out” and look at these results in the context of recruits per spawner (R-S) the difference is actually relatively small in comparison to expected outmigration given the large adult return in the fall of 2021 (see plot below).

image

As we can see above, despite the second-largest adult return to the entire Grays Basin in 2021, the [median] estimated outmigration in the spring of 2022 was one of the lowest in the 15 years of data (though I will highlight that the u95 of “run 2” is right on the imaginary B-H stock-recruit line my brain draws). So what could be causing this? Well, one factor is obviously the outmigration abundance of chum fry from Crazy Johnson (CJ), which is a subcomponent of the estimated abundance at the mainstem Grays trap.

image

In the R-S plot above, we can see that in the fall of 2021, CJ had its largest estimated adult return (~7.4K) in the 18-year data set (and the largest in the 11 years of the paired juvenile-adult dataset). Interestingly, we now have a 2nd data point suggesting that juvenile recruitment in CJ may be susceptible to overcompensation. But focusing back on the 2022 Grays Basin outmigration, the reduced outmigrant abundance in CJ doesn't entirely explain the lower-than-expected outmigration abundance at the Grays MS trap. So what else could be occurring?

Well, without getting my citations in order, we know that overwinter flows [during incubation] can affect egg-to-fry survival for mainstem spawning salmon. In the Grays Basin, I have no idea what the relationship is between flow/gauge height and bed load movement (e.g., scour depth), let alone the effects on incubation survival, but I can plot some data...which perhaps is a start to understanding what could be going on here.

So I started by pulling stream gauge data from the Grays River station (DOE station #25B060) and wrangling it with a script I've developed/refined over the years. Below is a plot of mean daily gauge height from Nov. 1 (Rel_Day = 1) to Mar. 15th (Rel_Day = 135) by brood year (2007 - 2021). I've added points to highlight when the maximum daily gauge height exceeds "flood stage" (orange; >12 ft.) and "major flood stage" (purple; >16 ft.) thresholds based on NOAA definitions specified here. No idea if these thresholds are biologically meaningful, but it is worth noting that the peak height on Jan. 7th, 2022, of 16.2' was the 2nd highest crest recorded at this gauge, which goes back to 2005. Here's a photo of the highest-ever recorded flows on the Grays from December 2007, when the gauge height was 16.5'.

image

Having just generated this plot a few hours ago, I haven't had a lot of time to digest the within- and among-year patterns. Incubation "flows" for BY2021 were higher than any other year by simply counting the number of individual "flood" and "major flood" days. However, I am sure there's way more nuance with regard to when the flood events occurred and, again, the threshold for bedload movement.
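For anyone wanting to reproduce the Rel_Day indexing and the flood-day tally described above, here is a minimal Python sketch. It is not the script used for the plot: the function names and the record layout are assumptions, and the only value taken from this thread is the 16.2 ft crest on Jan. 7, 2022. It also assumes a day above 16 ft counts as a "major flood" day only, not as both.

```python
from datetime import date

FLOOD_FT = 12.0        # "flood stage" threshold quoted in the post
MAJOR_FLOOD_FT = 16.0  # "major flood stage" threshold quoted in the post


def brood_year(d):
    """Nov.-Dec. dates belong to that year's brood; Jan.-Mar. dates to the previous fall's."""
    return d.year if d.month >= 11 else d.year - 1


def rel_day(d):
    """Day of the incubation season, counting Nov. 1 of the brood year as day 1."""
    return (d - date(brood_year(d), 11, 1)).days + 1


def in_incubation_window(d):
    """True from Nov. 1 through Mar. 15."""
    return d.month >= 11 or (d.month, d.day) <= (3, 15)


def flood_day_counts(records):
    """Count flood and major-flood days per brood year.

    records: iterable of (date, max_daily_stage_ft).
    Returns {brood_year: (flood_days, major_flood_days)}.
    """
    counts = {}
    for d, stage in records:
        if not in_incubation_window(d):
            continue
        by = brood_year(d)
        flood, major = counts.get(by, (0, 0))
        if stage > MAJOR_FLOOD_FT:
            major += 1
        elif stage > FLOOD_FT:
            flood += 1
        counts[by] = (flood, major)
    return counts


# Only the Jan. 7, 2022 value (16.2 ft) comes from the thread; the rest are made up.
records = [
    (date(2021, 11, 12), 12.5),
    (date(2021, 12, 1), 8.0),
    (date(2022, 1, 7), 16.2),
    (date(2022, 4, 2), 13.0),  # outside the window, ignored
]
counts = flood_day_counts(records)
```

With a real gauge export, `records` would be built from the daily maximum stage column; the brood-year keys then line up directly with the spawner data from the previous fall.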

Regardless, I figured I'd take the easiest possible stab at evaluating the effects of high flows on outmigration abundance by plotting the maximum observed stage height for each incubation brood year as a predictor of estimated chum fry abundance the following year (see below). I realize I should be doing this simple evaluation on the residuals of the spawner-recruit function (but I didn't). Still, the results are fairly compelling and it is worth noting that a few of the data points that looked suspicious in the R-S plot above now make more sense, maybe.

image
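The residual-based version of this check mentioned above could look something like the following sketch. It assumes a Ricker form (chosen here only because it linearizes to log(R/S) = a - b*S and can be fit with plain least squares; the thread does not settle on Ricker vs. Beverton-Holt) and uses synthetic numbers with a built-in negative stage effect, not the Grays data. The helper names are hypothetical.

```python
import math


def ols(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def ricker_residuals(spawners, recruits):
    """Fit log(R/S) = a - b*S by OLS and return the log-scale residuals."""
    log_rps = [math.log(r / s) for s, r in zip(spawners, recruits)]
    a, neg_b = ols(spawners, log_rps)
    return [y - (a + neg_b * s) for s, y in zip(spawners, log_rps)]


# Synthetic data only: recruits are generated from a Ricker curve with a
# built-in negative stage effect, so the direction of the answer is known.
spawners = [2000, 4000, 6000, 8000, 10000, 12000]
max_stage_ft = [10.0, 15.0, 11.0, 16.0, 12.0, 14.0]
recruits = [
    s * math.exp(6.0 - 1e-4 * s - 0.3 * (h - 13.0))
    for s, h in zip(spawners, max_stage_ft)
]

resid = ricker_residuals(spawners, recruits)
# Slope = change in the log(R/S) anomaly per foot of maximum stage.
_, stage_slope = ols(max_stage_ft, resid)
```

One caveat on the design: when spawner abundance and stage are correlated, this two-step approach shrinks the stage slope toward zero, so fitting spawners and stage in one model (or using the IPM's productivity anomalies) is probably the better long-term route.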

One more thing about BY2021 flows - interestingly, flows in late January and most of February were near record lows. The plot below shows mean daily flows in BY2021 (red) relative to all other years (gray). So maybe some redds got scoured/displaced during high flows and then dewatered during low flows?

image

It's worth (re)sharing this short video that was captured by Peter Barber (Restoration Ecologist – Cowlitz Indian Tribe) in March 2022. In an email sharing this video, Peter wrote: "The flood of January 2022 transported and deposited massive amounts of sediment and I was surprised to see the changes in a reach I’ve surveyed for a couple of years. One of the survey highlights was finding buried chum fry in lower Shannon Creek and a section of a West Fork Grays side channel. I’ve never observed this before and initially believed this was the location of a buried chum redd from the transported sediment during the January flood. In hindsight, I believe redd scour occurred during high flows and these eggs were transported downstream and buried in a sediment aggradation zone. My only rationale, if it was a redd, we would’ve observed hundreds of juveniles bubbling out of the gravels versus just a few."

Lastly, I think it's worth acknowledging an "elephant in the room" when it comes to generating juvenile estimates of abundance. While the majority of the flood stage flow events on the Grays River appear to occur in Nov. thru Jan., we do see these large flow events around and during the peak outmigration of chum fry, which generally occurs in mid- to late March (Rel_Day ~135-150). These high-flow events often result in missed trap days. Our abundance estimator(s) assume our trapping days are representative of the missed days (side note: we can add flow covariates but we typically do not trap these really high flow events). Given that fry are small and thus relatively poor swimmers, it seems possible that "a lot" of fry get washed out during these high flow events and we simply underestimate abundance during these high flow events. But I can't resist the urge to argue with myself and add to our list of unknowns -- do any of these hypothetically displaced fry survive?? If not, maybe our estimates are unbiased. In spring 2022, there was a large flow event in late-February (max stage 15.7'!!) that resulted in five missed days...and is the reason why I ran two estimates, which I mentioned at the beginning of this now very long post. Certainly a topic for another day but definitely another factor that could be influencing the observed patterns of abundance, survival, etc.

tbuehrens commented 1 year ago

Other than Pete Barber's speculation about eggs getting transported and surviving (bullshit), I think you're hitting a lot of the right points here...yes to covariates for freshwater recruitment anomalies!

ebuhle commented 1 year ago

Hey @kalebentley, thanks for this figuratively and literally deep dive! There's a lot here and I'm still digesting it, but my bottom line is that I agree with @tbuehrens that this is a promising avenue for developing covariates of freshwater recruitment. (I also share his skepticism about eggs surviving sediment transport; seems more likely there weren't more fry because most of them died when the redd got buried, but that's beside the point.)

As you note, there's certainly ample support in the literature for the effect of high flows on overwinter embryo / fry survival. Practically speaking, the next questions would be (1) data availability and (2) data transformation. There are lots of options for (2), e.g. number of days above some defined flood stage (as you show) or above Q10, or an annual upper quantile of gauge height, etc. But that's all moot unless we can find hydrology data with spatial and temporal coverage matching our population data. If DOE gauge 25B060 goes back to 2005-01-01, that leaves BY 2004 in the three Grays populations without (complete) covariate data. And then the same question applies to the other nine populations. Maybe there's sufficient spatial coverage that it would be possible to interpolate the missing locations / years from nearby gauges? And maybe someone in the region has even done this already???
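To make the transformation options in (2) concrete, here is a small Python sketch that turns daily incubation-window flows into three candidate annual covariates: days above Q10, an upper quantile, and the annual maximum. It assumes Q10 means the flow exceeded 10% of the time (i.e., the 90th percentile of the pooled record), and the input layout, function names, and toy numbers are all hypothetical.

```python
def quantile(values, p):
    """Linear-interpolation quantile (0 <= p <= 1) of a non-empty sequence."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def flow_covariates(flow_by_brood_year, p_upper=0.95):
    """Summarize daily flows into annual covariates.

    flow_by_brood_year: {brood_year: [daily flow, ...]} for the incubation window.
    Returns {brood_year: {"days_above_q10", "upper_quantile", "max"}}.
    """
    # Q10 is computed once from the pooled record so the threshold is shared across years.
    pooled = [q for flows in flow_by_brood_year.values() for q in flows]
    q10 = quantile(pooled, 0.90)
    out = {}
    for by, flows in flow_by_brood_year.items():
        out[by] = {
            "days_above_q10": sum(q > q10 for q in flows),
            "upper_quantile": quantile(flows, p_upper),
            "max": max(flows),
        }
    return out


# Toy numbers purely to show the shape of the output.
toy = {2019: [1, 2, 3, 4, 5], 2021: [6, 7, 8, 9, 10]}
covs = flow_covariates(toy)
```

Whether the threshold should be pooled across years (as here) or set per gauge is a modeling choice; with multiple gauges, a per-gauge Q10 would keep the covariate comparable among basins of different size.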

Digging into the details, my first thought was (of course) that it would be interesting to look at these patterns using the estimated states and S-R functions from the IPM. The plots shown in the Retrospective Models: Spawner-Recruit Functions section of the 2022 vignette are not directly comparable to your plots for the mainstem Grays because, as I note in the text, the S-R function and states as well as the observed spawners S_obs refer to the Grays MS population itself whereas the observed juvenile outmigrants M_obs are of course the sum of production from the three spawning areas upstream of the mainstem trap. However, comparing your Grays CJ figure to mine, I realized I made an embarrassing error in those plots -- S_obs and M_obs are aligned by calendar year, as in fish_data, when they should be aligned by brood year. D'oh! I'll fix that, but the population-specific plot for Grays MS still won't be comparable to your basin-wide plots of spawners and juvenile outmigrants. We could easily emulate the latter by working directly with the corresponding states. All that said, though, the estimated* observation errors and thus the posterior uncertainty in the states are so small that the patterns won't look dramatically different from those seen in the data.
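The calendar-year vs. brood-year alignment issue can be illustrated with a short Python sketch. It assumes a one-year lag (adults in the fall of year t, fry outmigrating in spring of year t + 1, as in the examples earlier in this thread); the function name and dict layout are hypothetical and not how fish_data is structured. The values are the approximate CJ numbers quoted elsewhere in this issue.

```python
def align_by_brood_year(s_obs, m_obs, lag=1):
    """Pair spawners in year t with outmigrants in year t + lag.

    s_obs, m_obs: {calendar_year: value}.
    Returns {brood_year: (spawners, outmigrants)}, dropping brood years
    that have no matching outmigrant observation.
    """
    return {
        by: (s, m_obs[by + lag])
        for by, s in sorted(s_obs.items())
        if by + lag in m_obs
    }


# Approximate Crazy Johnson values quoted in this thread.
s_obs = {2021: 7400, 2022: 3300}   # adults, keyed by return (calendar) year
m_obs = {2023: 1_200_000}          # fry, keyed by outmigration (calendar) year
aligned = align_by_brood_year(s_obs, m_obs)
```

Here brood year 2022 pairs the ~3.3K adults with the ~1.2M fry counted in spring 2023, while brood year 2021 drops out because no 2022 outmigrant value was supplied.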

The more interesting question is the one you raise about looking at residuals of the S-R curves. There is of course no parametric "S-R function" at the level of the entire Grays basin, because it combines three distinct populations. But we already have the population-specific S-R residuals aka smolt productivity anomalies, shown as time series in panel G here. There's not much population-level heterogeneity around the shared anomalies, although as we've discussed before, the caveat is that these estimates (and likewise SAR) are extrapolated from the 5/12 populations with direct outmigrant observations. What's interesting is that log freshwater productivity appears much more stable than logit SAR (as we can see by comparing the estimates of sigma_year_M and sigma_M vs. sigma_year_MS and sigma_MS) except for one wild outlier in BY 2019. That appears to be driven by the super-low 2020 juvenile outmigration in all four of the available data series, which corresponds to the yellow point for Grays CJ in @kalebentley's plots. So if that picture-perfect relationship between discharge and outmigrants holds up, then accounting for discharge as a covariate could really improve our estimates of both freshwater productivity and SAR. (Same goes for covariates of SAR, if we can find any with strong "effects".)

* I say "estimated" advisedly. @kalebentley's point about the elephant in the room vis a vis potentially unquantified observation error in M_obs caused by high flow is well taken, and reinforces my periodic handwringing about whether our strikingly low estimates of tau_M_obs and/or tau_S_obs are actually over-optimistic.

tbuehrens commented 1 year ago

@ebuhle @kalebentley I think the first (obvious) thing to realize is that survival residuals don't necessarily have a linear relationship with flow, nor is the non-linear relationship the same between different areas--e.g., I'd expect very high average flows during spawning and incubation to increase survival in off-channel habitat (more groundwater, more surface flow so more fish can get into the channel, more suitable spawning areas)...alternatively, way below average flows for prolonged periods in off-channel habitat could lessen egg-to-fry survival, and in the extreme cause adults not to even use the channels. On the other hand, in the mainstem, I'd expect average flows to have less of an effect than extreme high flows, which mobilize bedload and scour/bury redds.

ebuhle commented 1 year ago

@tbuehrens, good point. Any thoughts on how to model these context-dependent responses at the population level, given the data we have in hand (or could have in hand, if adequate hydrology data are available)?

kalebentley commented 1 year ago

Thanks @tbuehrens & @ebuhle for your fast feedback. I'll take a quick stab at responding to your main comments/question:

As you note, there's certainly ample support in the literature for the effect of high flows on overwinter embryo / fry survival. Practically speaking, the next questions would be (1) data availability

As you all are aware, there's a network of USGS and DOE "stream" ga(u)ges throughout the lower Columbia, but there isn't a unique gauge for every chum population. However, I would be surprised if someone/some group hasn't developed flow predictions for all major drainages based on the (limited) number of monitoring gages. I haven't attempted a search for this yet, but it's certainly the most logical place to start.

I know NOAA and NWS have a Northwest River Forecast Center but I don't understand how it "works". For instance, when you go to their webpage it only shows locations with USGS gages. However, as I highlighted in my original post, predictions are generated (periodically?) for Grays River using the DOE gauge data. What I can't tell is if they only made predictions for locations with gauge data (either USGS or DOE) or other locations based on say landscape characteristics. I tried searching around their website a bit to see if I could find a complete list of locations where predictions are made but couldn't easily (their website kind of sucks).

I certainly do not want to reinvent the wheel on something that seems like it should exist, so again the first thing will be to do more digging or reach out to colleagues for leads. That said, as it pertains to @ebuhle's question/comment about the time series of Grays Basin flow data not spanning the time series of abundance: there's a USGS gauge on the Naselle River, which is the watershed directly north of the Grays Basin. It looks like the Naselle gauge height dataset goes back to 2007 but (oddly) the discharge dataset goes back to 1987. A few years ago, I did a quick comparison between the Naselle and Grays data and they were highly correlated.
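A quick Naselle-vs-Grays comparison of that kind could be sketched in Python as below. This is not the comparison referred to above: the date-keyed dict layout, the function names, and all the flow values are made up, and correlating on the log scale is an assumption (it keeps a few big floods from dominating).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_log_flows(grays, naselle):
    """Keep only dates present in both records; return two lists of log flows."""
    dates = sorted(set(grays) & set(naselle))
    return (
        [math.log(grays[d]) for d in dates],
        [math.log(naselle[d]) for d in dates],
    )


# Hypothetical daily mean flows keyed by ISO date; Naselle is set to exactly
# twice Grays on the shared dates so the expected correlation is 1.
grays = {"2021-11-01": 100.0, "2021-11-02": 400.0, "2021-11-03": 250.0, "2021-11-04": 900.0}
naselle = {"2021-11-02": 800.0, "2021-11-03": 500.0, "2021-11-04": 1800.0, "2021-11-05": 300.0}

g, n = paired_log_flows(grays, naselle)
r = pearson(g, n)
```

The date intersection is the part that matters in practice, since the two gauges have different start dates and different gaps.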

As for

(2) data transformation.

I don't have a lot to add to your initial suggestions except that I would probably start simple. I definitely appreciate @tbuehrens's comments regarding non-linear relationships and different effects among populations. However, as we've discussed many times, we have a relatively small number of overall populations and an even smaller number of ones with juvenile data (which are weighted heavily towards off-channel areas), so I'm not sure we have a lot of options here. Maybe start with categorically defining populations into two categories (off-channel/artificial and mainstem including tribs & Col. River) and looking at interactions with a few flow-related covariates? Related...

The more interesting question is the one you raise about looking at residuals of the S-R curves...except for one wild outlier in BY 2019. That appears to be driven by the super-low 2020 juvenile outmigration in all four of the available data series,

As for BY2019, my recollection was that flows were extremely low throughout the lower Columbia Basin, which can be seen in my original Grays gauge height plot here https://user-images.githubusercontent.com/40397762/199361249-466648ff-689e-4cbf-9ac0-bf5973c865fa.png. I know this really affected adults being able to access CJ and Hamilton Springs during Nov. & early Dec., and I remember the Duncan channel being low throughout incubation. When I glance at the FW anomalies and my R-S plot for Grays_Total, the effect was large for these off-channel populations but not so much for Grays_Total. So again, as Thomas hypothesized, these low flows may be larger factors for off-channel populations. However, as Eric and I just discussed on the phone, the timing of the low flows and the specific population probably matter. For instance, if low flows happen early like in BY2019 and adults can't access certain spawning areas, do they go spawn elsewhere? Maybe/probably. Do our datasets capture this? (Yes for Grays, no for Hamilton.) And that's before trying to account for "downstream" effects (e.g., fewer spawners in CJ but more in WF/MS). Regardless, I think we start simple.

  • I say "estimated" advisedly. @kalebentley's point about the elephant in the room vis a vis potentially unquantified observation error in M_obs caused by high flow is well taken, and reinforces my periodic handwringing about whether our strikingly low estimates of tau_M_obs and/or tau_S_obs are actually over-optimistic.

Freshwater covariates aside, I'm wondering if it's worth going back and re-evaluating our juvenile datasets - at least for Grays and CJ - with the goal of reassessing the estimated uncertainty. Again without diving into details too much here...over the past few years, I've slightly modified my approach to generating estimates of abundance for chum fry. In short, I used to stratify the dataset into 3-7 day periods based on when efficiency trials were conducted. Now, I generate daily estimates and am more mindful of censoring questionable data. This may not totally address my original point about the representativeness of observed vs. missed trapping days, but it could maybe increase estimates of uncertainty.

tbuehrens commented 1 year ago

flow script.zip
Hey Kale, NWRFC only makes future predictions (which aren't that good btw). What you really need for this project is "interpolations" between all the missing data. Fortunately, I wrote a script a while ago to do this: 2 functions to pull in flow data (Ecology or USGS) and then fit simple MARSS models to interpolate missing or bad values...see attached....the only thing I haven't done here is "re-center" and "re-scale" the flow data. I fit the model on the logged and z-scored flow. In reality, logged and z-scored flow may be perfectly suitable for our purposes anyway.
image
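For reference, the log + z-score transform and the "re-center"/"re-scale" step back to flow units can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not code from the attached script: the function names and the example flows are made up, and the MARSS fitting step itself is not reproduced here.

```python
import math
import statistics


def log_zscore(flows):
    """Log-transform positive flows, then standardize to mean 0 and SD 1.

    Returns (z, mu, sd), where mu and sd are the mean and sample SD of the
    log flows, kept so the transform can be undone later.
    """
    logs = [math.log(q) for q in flows]
    mu = statistics.fmean(logs)
    sd = statistics.stdev(logs)
    return [(x - mu) / sd for x in logs], mu, sd


def back_transform(z, mu, sd):
    """Undo log_zscore: re-scale, re-center, and exponentiate back to flow units."""
    return [math.exp(mu + sd * zi) for zi in z]


# Hypothetical daily flows for one gauge.
flows = [120.0, 340.0, 95.0, 800.0, 210.0]
z, mu, sd = log_zscore(flows)
restored = back_transform(z, mu, sd)
```

In a multi-gauge setting, `mu` and `sd` would be stored per gauge, so interpolated z-scores can be mapped back to each gauge's own units. If the covariate only needs to express relative anomalies, the z-scores can be used as they are.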

tbuehrens commented 1 year ago

also, I should mention that it has been my belief that DOE flow gauges are not as carefully maintained and therefore suffer bigger QA/QC issues than USGS ones...I'm not sure I believe how low the Grays flows got in the summer the two most recent years, for example (see top plot in html where they clearly look like outliers)....wouldn't be too hard to "throw some additional data out" and refit if you cared about summer flow (which I doubt you really do).

tbuehrens commented 1 year ago

i went ahead and put that code on github in case useful: https://github.com/tbuehrens/streamflow_tools

Hillsont commented 1 year ago

As an alternative to the Grays DOE gauge, you could use the USGS gauge on the Naselle. The basins are next to each other, and it has a longer time series. I learned to get a feel for what's happening in the Grays using the Naselle gauge, back before DOE put a gauge on the Grays. As far as predictions on the Grays, Advanced Hydrologic Predictions (https://water.weather.gov/ahps2/index.php?wfo=pqr) only shows a prediction for the Grays when it's forecasted to reach a flood stage. Another reason to use the Naselle: it always has a 7-day forecast.

kalebentley commented 1 year ago

Cool @tbuehrens. Looks like your code will help fill in any missing or bad gaps within an existing time series. A couple of follow-up questions:
1.) It's been long enough since I've messed around with MARSS models that I don't remember how well they can fit outside a data set. For instance, could it predict flows/gauge height for Grays River in BY 2004 if we had 1+ datasets that go back that far (reminder: DOE Grays gauge only goes back to BY2005)?
2.) Did you have thoughts on flow data we could use for populations/basins without a gauge (e.g., Duncan Springs, Hardy Creek, and many more once we add more pop'ls)?

Hillsont commented 1 year ago

I wouldn't be surprised to see a sharp change in CJ productivity related to flow. I haven't been down there in a while to see changes but the overland cut/ connection to the MS activates at a certain flow.

tbuehrens commented 1 year ago

1) Kale, I improved the model since I last posted it (see my GitHub page, v4).
2) Yes, we can easily predict BY 2004 if we have flow data from at least a few of the time series (e.g., Naselle, Chehalis).

Hillsont commented 1 year ago

Also, I'm probably not going to say this right, I wouldn't expect CJ productivity to match/ track spawners. CJ has a very limited amount of spawning habitat, even after the addition of the spawning channel. What I remember is when adult abundance increases, they start pushing out into less than ideal habitat, e.g. up into CJ proper and the overland cut to the MS Grays.

ebuhle commented 1 year ago

Very helpful discussion so far, thanks everyone. @tbuehrens, this MARSS approach looks really cool; thanks for putting it on GH. I haven't dug into the code deeply enough yet, but given the degree of spatial and temporal autocorrelation in flow, e.g. as seen in your example above, I'd imagine there are enough ga(u)ges distributed throughout our domain of interest that we could do a pretty good job of interpolating those unmonitored basins @kalebentley mentioned. Also, the predictions do of course come with error estimates. I wonder if it would be worth exploring a spatially structured Q matrix to better reflect the stronger correlation between nearby watersheds than between distant ones.
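As one illustration of what "spatially structured" could mean, the Python sketch below builds a process-error covariance matrix whose off-diagonal terms decay exponentially with distance between gauges. The exponential form, the parameter names (`sigma`, `range_km`), and the distances are all assumptions made for illustration; nothing here is taken from the streamflow_tools code, and whether this structure can be estimated directly in MARSS is a separate question.

```python
import math


def spatial_q(distances_km, sigma=1.0, range_km=50.0):
    """Exponential-decay covariance: Q[i][j] = sigma^2 * exp(-d_ij / range_km).

    distances_km: symmetric matrix (list of lists) of pairwise distances
    with zeros on the diagonal.
    """
    n = len(distances_km)
    return [
        [sigma ** 2 * math.exp(-distances_km[i][j] / range_km) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical pairwise distances (km) among three gauges:
# gauges 0 and 1 are neighbors, gauge 2 is far from both.
dist = [
    [0.0, 15.0, 120.0],
    [15.0, 0.0, 110.0],
    [120.0, 110.0, 0.0],
]
Q = spatial_q(dist, sigma=0.5, range_km=50.0)
```

The appeal of a form like this is that it replaces a full unconstrained Q (many parameters) with just two, at the cost of assuming correlation depends only on distance and not on, say, elevation or aspect.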

Also, I'm probably not going to say this right, I wouldn't expect CJ productivity to match/ track spawners. CJ has a very limited amount of spawning habitat, even after the addition of the spawning channel. What I remember is when adult abundance increases, they start pushing out into less than ideal habitat, e.g. up into CJ proper and the overland cut to the MS Grays.

Yeah, this is tricky. On the one hand, the behavioral mechanisms @Hillsont describes here could simply manifest as compensation in the spawner-recruit relationship. For example, Walters's foraging-arena theory derives the Beverton-Holt model from assumptions about habitat selection trade-offs as a function of density (albeit related to trophic interactions rather than spawning site quality). However, the possibility that less-than-optimal habitat might actually spill over into another "population" adds a wrinkle that we can't readily model.

And then there's the separate issue that high flows may actually increase FW productivity in some systems by opening up natural or artificial off-channel habitat. It would be easy enough to check whether this shows up in the CJ-specific relationship between max discharge and outmigrant abundance (vs. the basin-wide relationship shown in @kalebentley's OP).

Freshwater covariates aside, I'm wondering if it's worth going back and re-evaluating our juvenile datasets - at least for Grays and CJ - with the goal of reassessing the estimated uncertainty. Again without diving into details too much here...over the past few years, I've slightly modified my approach to generating estimates of abundance for chum fry. In short, I used to stratify the dataset into 3-7 day periods based on when efficiency trials were conducted. Now, I generate daily estimates and am more mindful of censoring questionable data. This may not totally address my original point about the representativeness of observed vs. missed trapping days, but it could maybe increase estimates of uncertainty.

I for one would be interested in further exploring this.

Hillsont commented 1 year ago

@ebuhle. Would a site visit be beneficial? I know for me firsthand observations are better than any text descriptions. The chum are starting to arrive in the Grays, they should peak near the end of this month.

ebuhle commented 1 year ago

@Hillsont, I wholeheartedly agree and would love to visit the Grays (or any of these watersheds, for that matter) during the chum run. Unfortunately I can't really hack it in the field anymore thanks to post-surgical crippledness. The last time I tried was six years ago on the Cedar and I nearly drowned, haha (story for another time). Maybe someday...

Hillsont commented 1 year ago

@ebuhle, sorry to hear about the limitations. While Grays and CJ are not an option (hiking miles of stream banks to get to where you want to be), there are places like the Duncan and Hamilton Springs spawning channels that you can drive to, with just short walks on paths / level ground. Joining one of our seining crews for a boat ride to see the mainstem Columbia River spawning areas and our operations there (beach seining via boat, broodstock collection, live tagging and recap activities) is also an option if you're interested.

kalebentley commented 9 months ago

Hey guys,

I wanted to provide a very brief update on this topic. I started working on estimates of abundance for juvenile chum outmigrants in spring 2023 this week. My preliminary estimates are ~1.2 M for CJ and ~1.7 M for Grays Basin (i.e., MS + WF + CJ).

Adding these estimates to the plots I generated last year, we can see that the total outmigration (1.7M) is less than what we would perhaps expect given the total abundance of adults the previous fall (~13K).
image

As I showed last year, the outmigration of chum fry from CJ in 2023 (1.2 M) does match expectations (of my eyeballed B-H or Ricker curve) given the abundance of adults in the fall of 2022 (~3.3K):
image

Adding one new plot, here's a plot of juvenile outmigrants from the MS & WF (basin-wide minus CJ) as a function of adult abundance in MS & WF the previous fall. Ignoring the two "outlier" data points from 2016 & 2018, the data certainly look more like a shotgun blast where we see roughly the same number of outmigrants whether there were 3K spawners or 10K:
image

So what's going on here? Well, I don't have any additional hypotheses beyond what I wrote up last year, but here's an updated plot showing the relationship between the maximum observed Grays River gauge height during incubation (Nov. 1 - mid-March) and the resulting juvenile outmigration (NOTE: this plot shows juvenile abundance for MS+WF):
image

Again, as I also highlighted last year, I fully acknowledge that a more appropriate evaluation/plot would be looking at the residuals from a stock-recruit fit as a function of flow (and a more in-depth evaluation of flow timing, thresholds, etc.) but based on my quick eyeballing of the last two plots here it certainly seems like the negative relationship holds.

One last thing worth highlighting is that the max. gauge height for return year 2022 came in early November, which I believe is before the typical peak of spawning in the basin. I don't have any information on hand to confirm when peak spawning occurred in 2022...but certainly a topic worth exploring more as we dig into what's happening here.
image

If this relationship continues to hold, the outmigration this coming spring (2024) will again be lower than "expected" as the Grays River reached "major flood stage" earlier this week (and if the preliminary reading holds, it would be the 2nd highest reading in the basin since 2005):
image

I know we already realize the importance of off-channel habitat for chum but seeing these data, and knowing that big winter storms are likely to only become more common, really emphasizes it for me.