NOAA-EDAB / tech-doc

Technical documentation for ecosystem reporting
https://noaa-edab.github.io/tech-doc

[Submission]: Zooplankton Indices #114

Open sgaichas opened 3 weeks ago

sgaichas commented 3 weeks ago

Data Source(s)

The Northeast Fisheries Science Center has conducted zooplankton surveys since the 1970s.

The dataset through 2022 used in this analysis was obtained from Harvey Walsh.

Data Processing

Data were aggregated into groups for analysis (names indicate column names in the dataset linked above):

A lookup of these column headings is here: https://www.fisheries.noaa.gov/inport/item/35054

Data were assigned to seasonal blocks

Years were selected for analysis

1982 - 2022

The script that processes input data is available here: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/data/VASTzoopindex_processinputs.R

Data Analysis

VAST models

VAST is a spatio-temporal modeling framework used for index standardization [@thorson_comparing_2017; @thorson_guidance_2019].

Zooplankton models were evaluated using two stages of model selection to determine whether to include

  1. spatial and spatio-temporal random effects, and

  2. "catchability" covariates affecting the observation process: day of year.

The model selection script is available here: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_modselection.R

Models were run using REML and without bias correction.

Two different observation models were applied. The default VAST index standardization (purpose = "index2" in make_settings) uses a Gamma distribution for positive catches within the alternative "Poisson-link delta-model", which uses a log-link for numbers-density and a log-link for biomass per number (ObsModel = c(2,1)).

We applied the default observation model to Calanus finmarchicus (calfin_100m3) and Large copepods (calfin_100m3 + mlucens_100m3 + calminor_100m3 + euc_100m3 + calspp_100m3) datasets.
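As a minimal sketch of the Large copepods aggregation described above, the snippet below sums the five listed columns for a single station record. The column names come from the text; the function name, the example values, and the choice to count a missing or empty value as 0 are assumptions made for illustration and are not taken from the processing script.

```python
# Column names as listed in the text above.
LARGE_COPEPOD_COLUMNS = [
    "calfin_100m3",
    "mlucens_100m3",
    "calminor_100m3",
    "euc_100m3",
    "calspp_100m3",
]


def large_copepods_100m3(station):
    """Sum the large copepod columns for one station record (a dict).

    Assumption: a missing column or a None value contributes 0.
    """
    return sum((station.get(col) or 0.0) for col in LARGE_COPEPOD_COLUMNS)


# Hypothetical station record with made-up abundances per 100 cubic meters.
example_station = {
    "calfin_100m3": 120.0,
    "mlucens_100m3": 30.5,
    "calminor_100m3": None,  # treated as 0 under the stated assumption
    "euc_100m3": 4.5,
}
example_total = large_copepods_100m3(example_station)
print(example_total)
```

How missing values are actually handled should be checked against the processing script linked in the Data Processing section.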

The default was used for index standardization of stomach contents data for the pelagic and benthic forage indices. It is intended for continuous data, which includes biomass data and "numbers standardized to a fixed area" (see the section starting at line 239 in the VAST user manual). I am interpreting zooplankton abundance per 100 cubic meters as numbers standardized to a fixed area (volume) in applying the Gamma observation model.

For data with years where the species is present in all (or none) of the samples, estimating the probability of encounter fails (or at least, VAST won't let you try). In these cases, the options are to treat the intercepts representing temporal variability as random effects (by setting the RhoConfig Beta or Epsilon entries to something other than 0), or to use a different link function.

We intend for our indices to potentially be used in assessment (though as a covariate rather than an index, so maybe I'm being too strict), so the recommendation is to "minimize covariance in the estimated index by excluding any temporal correlation on model components (i.e., the intercept is a fixed effect in each year, and the spatio-temporal term is independent in each year)" [@thorson_guidance_2019].

All of the small copepods datasets had at least one year where our small copepods groupings were encountered at all stations. None had years with 0 encounters.

Therefore, we used a different link function: the Poisson-link delta-model, fixing encounter probability = 1 for any year where all samples encounter the species. We kept all other index standardization settings the same, but set ObsModel = c(2,4).
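The check behind this choice can be sketched as follows: compute the proportion of samples with a positive abundance in each year, then switch from ObsModel c(2,1) to c(2,4) if any year has encounters in every sample. This is an illustrative sketch only. The function names and the (year, abundance) record layout are assumptions, and the case of years with 0 encounters is deliberately left out because it did not occur in these datasets.

```python
from collections import defaultdict


def encounter_rate_by_year(records):
    """Return {year: proportion of samples with abundance > 0}.

    records is an iterable of (year, abundance) pairs (assumed layout).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, abundance in records:
        totals[year] += 1
        if abundance > 0:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in totals}


def choose_obs_model(records):
    """Pick the ObsModel pair following the rule described in the text.

    Returns (obs_model, years_with_all_encounters). Years with 0
    encounters are not handled here (none occurred in the datasets).
    """
    rates = encounter_rate_by_year(records)
    full_years = sorted(year for year, rate in rates.items() if rate == 1.0)
    obs_model = (2, 4) if full_years else (2, 1)
    return obs_model, full_years


# Made-up samples: 1983 has positive abundance at every station.
samples = [(1982, 5.0), (1982, 0.0), (1983, 3.0), (1983, 1.0)]
print(choose_obs_model(samples))
```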

Model selection results are reported at this link: https://noaa-edab.github.io/zooplanktonindex/CopeModSelection.html

In the second stage of model selection, the day of year covariate had mixed success. Results of the best models which both converged and had the lowest AIC are reported here: https://noaa-edab.github.io/zooplanktonindex/CopeModResults.html
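The selection rule stated above (keep only models that converged, then take the lowest AIC) can be sketched as below. The model names, AIC values, and dictionary layout are hypothetical and are not results from this analysis; the actual comparison is in the model selection script and the linked results pages.

```python
def best_model(fits):
    """Return the converged fit with the lowest AIC, or None if none converged.

    Each fit is a dict with keys "name", "converged", and "aic"
    (an assumed layout for illustration).
    """
    converged = [fit for fit in fits if fit["converged"]]
    if not converged:
        return None
    return min(converged, key=lambda fit: fit["aic"])


# Hypothetical candidates: one without and one with the day of year covariate.
candidates = [
    {"name": "base", "converged": True, "aic": 1520.3},
    {"name": "doy", "converged": False, "aic": 1498.7},  # lower AIC, but did not converge
    {"name": "doy_alt", "converged": True, "aic": 1511.9},
]
selected = best_model(candidates)
print(selected["name"])
```

Filtering on convergence first matters here: a non-converged model can report a lower AIC and would otherwise be selected.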

Bibliography

@Article{thorson_comparing_2017,
  title = {Comparing estimates of abundance trends and distribution shifts using single- and multispecies models of fishes and biogenic habitat},
  author = {Thorson, James T. and Barnett, Lewis A. K.},
  journal = {ICES Journal of Marine Science},
  year = {2017},
  month = may,
  volume = {74},
  number = {5},
  pages = {1311--1321},
  issn = {1054-3139},
  doi = {10.1093/icesjms/fsw193},
  url = {https://doi.org/10.1093/icesjms/fsw193},
}

@Article{thorson_guidance_2019,
  title = {Guidance for decisions using the {Vector} {Autoregressive} {Spatio}-{Temporal} ({VAST}) package in stock, ecosystem, habitat and climate assessments},
  author = {Thorson, James T.},
  journal = {Fisheries Research},
  year = {2019},
  month = feb,
  volume = {210},
  pages = {143--161},
  issn = {0165-7836},
  doi = {10.1016/j.fishres.2018.10.013},
  url = {http://www.sciencedirect.com/science/article/pii/S0165783618302820},
  language = {en},
  keywords = {Climate vulnerability analysis, Distribution shift, Habitat assessment, Index standardization, Integrated ecosystem assessment, Spatio-temporal model, Stock assessment, VAST},
}