MargaretSiple-NOAA / goa-ai-data-reports

Automate data reports for GOA and AI surveys

Create R script to produce "Table 3" #33

Closed MargaretSiple-NOAA closed 2 weeks ago

MargaretSiple-NOAA commented 1 year ago

Table 3 = for a given species and year, a summary of CPUE and biomass by survey district and depth interval. It looks like this: [image]

In documentation and in convo with Paul and Nate, the CPUE and biomass estimate tables for a given species are always called "Table 3" and "Table 4" even though these may not match their names in the formal report. I'm using that terminology here to be consistent, but just FYI Table 3 and Table 4 repeat for each species in the report by chapter.

I don't know where the SQL script is to produce Table 3 (or Table 4 for that matter), so I'm busting this issue into two tasks and assigning both Paul and Alex, though I think each of you will do one of these:

MargaretSiple-NOAA commented 1 year ago

OK Paul has provided the path to the instructions for producing Table 3: AI-GOA/Instructions&Procedures/Data Report/Table 3/Instructions CPUE tables by INPFC area and depth_REVISED OCT 12 2017

I have done the same for Table 4 and it looks fine, so this is the one species-specific table we need to do.

MargaretSiple-NOAA commented 1 year ago

Here is a rough sketch of how I think this table is produced. I believe Paul uses something like BIOMASS_INPFC_DEPTH, but it should be replicable from the tables in the AI and GOA schemas.

  1. Ingredients: probably just GOA.GOA_STRATA (which, counterintuitively, contains stratum information for all the strata in the GOA and the AI), and AI.BIOMASS_STRATUM.
  2. Filter GOA_STRATA to AI strata only, then left join that table to BIOMASS_STRATUM so that you have stratum areas for each stratum -- this is important because you'll use the areas to weight the biomass averages, as in this function.
  3. You want to calculate the mean biomass and the CIs for each district-depth combo. So for each district and depth, calculate the mean CPUE and biomass, with each stratum weighted by its area over the total area.
  4. For the confidence intervals, I truly don't know what to do. I forget how they are calculated for some of these summary tables. But! The code that Zack wrote for design-based-indices is analogous, so if you want to give it a go, it will be very similar to this.
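
Steps 2 and 3 above can be sketched roughly as follows. This is only an illustration of the join and the area weighting, written in plain Python so it runs without a database connection; the same logic should port to a `dplyr` left join plus a grouped `weighted.mean()`. The column names (`stratum`, `district`, `depth`, `area`, `cpue`), the district labels, and all the numbers are made up for the example and are not the real GOA_STRATA or BIOMASS_STRATUM schema. It does not attempt the confidence intervals from step 4.

```python
from collections import defaultdict

# Hypothetical rows standing in for GOA.GOA_STRATA after filtering to AI strata.
# Column names and values are illustrative assumptions, not the real schema.
strata = [
    {"stratum": 1, "district": "A", "depth": "1-100", "area": 100.0},
    {"stratum": 2, "district": "A", "depth": "1-100", "area": 300.0},
    {"stratum": 3, "district": "B", "depth": "101-200", "area": 200.0},
]

# Hypothetical rows standing in for AI.BIOMASS_STRATUM for one species and year.
biomass = [
    {"stratum": 1, "cpue": 10.0},
    {"stratum": 2, "cpue": 20.0},
    {"stratum": 3, "cpue": 5.0},
]


def left_join(strata_rows, biomass_rows):
    """Left join strata to biomass on stratum, keeping every stratum row."""
    by_stratum = {row["stratum"]: row for row in biomass_rows}
    joined = []
    for s in strata_rows:
        match = by_stratum.get(s["stratum"], {})
        # Strata with no biomass record get cpue = None.
        joined.append({**s, "cpue": match.get("cpue")})
    return joined


def area_weighted_mean(rows, value="cpue"):
    """Area-weighted mean of `value` for each (district, depth) combination."""
    weighted_sum = defaultdict(float)
    total_area = defaultdict(float)
    for row in rows:
        if row[value] is None:
            # Assumption: strata without an estimate are skipped, not treated as zero.
            continue
        key = (row["district"], row["depth"])
        weighted_sum[key] += row["area"] * row[value]
        total_area[key] += row["area"]
    return {key: weighted_sum[key] / total_area[key] for key in weighted_sum}


table3_cpue = area_weighted_mean(left_join(strata, biomass))
for (district, depth), mean_cpue in sorted(table3_cpue.items()):
    print(district, depth, mean_cpue)
```

Whether strata with no biomass record should be dropped or counted as zero is a choice the sketch makes arbitrarily; the real table should follow whatever the Table 3 instructions say.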

I'm sorry I'm not more helpful! My brain is very foggy. I will try this again when I am not sick. In the meantime, even an RODBC::sqlQuery() version is good. And that isn't even needed until next year!