afsc-gap-products / gap_products

This repository supports code used to create tables in the GAP_PRODUCTS Oracle schema. These tables include the master production tables, tables shared with AKFIN, and tables publicly shared on FOSS.
https://afsc-gap-products.github.io/gap_products/
Creative Commons Zero v1.0 Universal
5 stars 5 forks source link

Edits and improvements to GAP_PRODUCTS.DESIGN_YEAR #23

Closed EmilyMarkowitz-NOAA closed 4 months ago

EmilyMarkowitz-NOAA commented 5 months ago

Issue Description

1. Content: Shouldn't DESIGN_YEAR in the GAP_PRODUCT.SURVEY_DESIGN table be the year that year's area calculations were calculated in? Here, all EBS entries have DESIGN_YEARs of 2022 even for YEARs before 2022 (which is impossible). For example and if I recall correctly, design-based estimates calculated in 2023 and 2022 were calculated with area estimates where DESIGN_YEAR = 2022, but estimates from 2021 thru 2019 were originally estimated with the area calculations produced in 2019, 2018-2010 with 2010, 2009-1993 with 1993, and so on. This will be an issue with NBS and (I believe) BSS, too.

image

2. Description: Stock assessors asked for some more direction and documentation about how to use/impliment the SURVEY_DESIGN table. Here is what is currently listed:

image

### Tasks
- [ ] Fix the years listed in the `GAP_PRODUCTS.SURVEY_DESIGN.DESIGN_YEAR` column. _I've added a new column to the table in [future oracle](https://docs.google.com/spreadsheets/d/1wgAJPPWif1CC01iT2S6ZtoYlhOM0RSGFXS9LUggdLLA/edit?usp=sharing) table temporarily called `DESIGN_YEAR1` with what I think the years should be (referencing `GAP_PRODUCTS.AREA.DESIGN_YEAR and RACEBASE.STRATUM`). Edit and use this column to update the `DESIGN_YEAR` column when everything looks right and if these changes are appropriate._ 
- [ ] There are years of the NBS (`SURVEY_DEFINITION_ID` = 143) that are listed in the `DESIGN_YEAR` table but were never conducted. _I have highlighted these years in red and struckout the text. Please go in and delete them if you agree with these changes._ Are there any other surveys listed there that were never conducted? 
- [ ] Should 2023 Bering Sea slope (`SURVEY_DEFINITION_ID` = 78) be added to the `DESIGN_YEAR` table (as is currently in the `AREA` table). If we are not doing a BSS survey in 2023, does it get added to DESIGN_YEAR? _I've added these to the [future oracle](https://docs.google.com/spreadsheets/d/1wgAJPPWif1CC01iT2S6ZtoYlhOM0RSGFXS9LUggdLLA/edit?usp=sharing) table in green for your review._ 
- [ ] Change `DESIGN_YEAR` description from to "The year that the AREA_KM2 of the AREA_ID was calculated. AREA_ID areas are recalculated as needed with better bathymetric and land-water boundary data. " Or something like this. 
- [x] `SURVEY` should be deleted as it is redundant to SURVEY_DEFINITION_ID. _I've deleted this column from the [future oracle](https://docs.google.com/spreadsheets/d/1wgAJPPWif1CC01iT2S6ZtoYlhOM0RSGFXS9LUggdLLA/edit?usp=sharing) table and should update directly in the next data product run._ 
- [ ] More documentation in the table description about how this table should be used to select the `DESIGN_YEAR` in `GAP_PRODUCTS.AREA`/`AKFIN_AREA` should be included. In addition to what is here, we should add something like: `"For a given YEAR of the survey, use the associated DESIGN_YEAR to recreate the design-based estimates with the stratum areas of that survey YEAR. DESGIN_YEAR is survey specific. "` after the first sentence (or something like it).
- [ ] @Duane-Stevenson-NOAA can you also check my new designations for the new and temporary `DESIGN_YEAR1` column?
Duane-Stevenson-NOAA commented 5 months ago

I feel like I need more context on this. The "content" portion of this issue seems like a misunderstanding about the fact that we recalculate all the biomass, agecomps, sizecomps for all years after each new survey year. So it's not impossible that the 1999 biomass estimates were calculated with DESIGN_YEAR = 2022. Thus, I'm not sure why we need a new column or an update to the DESIGN_YEAR column. Em, maybe we should chat about this.

Duane

On Wed, Feb 7, 2024 at 6:35 PM Em Markowitz (NOAA) @.***> wrote:

Assigned #23 https://github.com/afsc-gap-products/gap_products/issues/23 to @Duane-Stevenson-NOAA https://github.com/Duane-Stevenson-NOAA.

— Reply to this email directly, view it on GitHub https://github.com/afsc-gap-products/gap_products/issues/23#event-11740119221, or unsubscribe https://github.com/notifications/unsubscribe-auth/ANKDWAVFACXAAAC2AGLOF5TYSQ2WJAVCNFSM6AAAAABC63NMZ6VHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJRG42DAMJRHEZDEMI . You are receiving this because you were assigned.Message ID: @.*** com>

-- Duane Stevenson, Ph.D. Supervisory Fish Biologist Groundfish Assessment Program NMFS, Alaska Fisheries Science Center

Ned-Laman-NOAA commented 5 months ago

I had the impression from the conversation in the virtual room that much of the confusion about DESIGN_YEAR had to do with the introduction of a lookup field that the users didn't anticipate having to leverage and was less about which design year's stratum areas you used to do your computations. That said, I would suggest that both concepts, the one Duane raised and the one I raised, be explored the next time we get the team together in Office Hours.

EmilyMarkowitz-NOAA commented 4 months ago

Sounds good. It'll be good to table this until @zoyafuso-NOAA can join in on this discussion, too!

zoyafuso-NOAA commented 4 months ago

@Duane-Stevenson-NOAA is correct in that all survey years' indices are calculated with the 2022 version of the EBS stratum areas. So the DESIGN_YEAR value for YEAR = (1982-2023) for the EBS SURVEY_DEFINITION_ID = 98 should all be DESIGN_YEAR=2022. It sounds redundant but this format is important when the GOA survey time series goes from DESIGN_YEAR=1984 for YEAR = (1990-2023) and DESIGN_YEAR=2025 for YEAR = (2025-onwards) due to the restratificaiton.

This table is definitely used internally within gapindex and is the main reason for existing. I was probably not privy to this conversation at the last office hours so I am not sure how this table is used outside of that. It seems like folks want to query the GAP_PRODUCTS.AREA with just the appropriate stratum records used in the index calculation?

@EmilyMarkowitz-NOAA you’re right about the NBS years, thanks for cleaning that up. For the slope, there shouldn't be a record for the BSS with YEAR = 2023 because there wasn't a slope survey in 2023. However, when the time is appropriate, the value for DESIGN_YEAR for the BSS years will be updated from 2002 -> 2023.

EmilyMarkowitz-NOAA commented 4 months ago

Oh thanks for clearing that up, Zack. So... I think I am the one who was confused about what this table was for and suspect I have likely mislead folks/started this confusion. I thought it was documenting what stratum calculations estimates were previously calculated with (which I now see is wrong). We'll just clear it up at the next office hours I suppose? But glad I was able to fix a few other finds! @zoyafuso-NOAA, you'll go through everything here and implement stuff as needed and close the issue? Or is there anything else I can provide/help with?

Also, we currently make the AKFIN_SURVEY_DESIGN table available to AKFIN. Do we want to remove that table from the dump if it is internal/should only be needed in gapindex?

zoyafuso-NOAA commented 4 months ago

I think for now we can still provide the SURVEY_DESIGN table to AKFIN and continue discussions with users as to whether this table will still be useful to them before removing the table from the AKFIN table transfer.