afsc-gap-products / gap_products

This repository contains the code used to create tables in the GAP_PRODUCTS Oracle schema. These tables include the master production tables, tables shared with AKFIN, and tables publicly shared on FOSS.
https://afsc-gap-products.github.io/gap_products/
Creative Commons Zero v1.0 Universal

Remove crab estimates from groundfish production tables? #3

Closed EmilyMarkowitz-NOAA closed 7 months ago

EmilyMarkowitz-NOAA commented 9 months ago

I was looking through the AKFIN production tables and realized that we include crab estimates in our public-facing data sets (see reproduced code chunks below).

I propose...

Short-term: Remove crab biomass/population/mean CPUE estimates from AKFIN tables

I think we should remove these estimates from AKFIN because 1) the crab team prepares their own estimates for AKFIN, and 2) our estimates would be redundant with theirs and inconsistent with them, since we do not apply the same calculations and area summaries that the crab team does.

The only big downside of removing the crab estimates (as far as I can tell) is that users will no longer be able to calculate the total biomass for the survey.

I think we can keep data in FOSS (and therefore also keep these data in the standard production CPUE table). Back when we initially created the FOSS data, we got the OK from Mike Litzow to include crab data. These estimates shouldn't be too far off since these are station-level estimates.

Long-term: Pull crab data into production/AKFIN tables or integrate crab calculations

Perhaps, in the future, we can pull and bind our estimates with crab estimates. In the northern Bering Sea community highlights, I pull crab.gap_ebs_nbs_abundance_biomass (see example) and crab.gap_ebs_nbs_crab_cpue (see example) into my biomass and CPUE tables. These tables don't have all of the data we need for this task, but could be helpful in thinking through this issue.

Tagging @ShannonHennessey for awareness. Feel free to share your thoughts as you start to explore the crab schemas!

Here are the crab estimates currently in our tables. (If it needs to be said, there are no crab estimates in our sizecomp or agecomp tables.)

> gap_products_akfin_cpue0 %>% 
+     dplyr::filter(species_code %in% c(69323, 69322, 68580, 68560))
# A tibble: 124,113 × 7
   hauljoin species_code weight_kg count area_swept_km2 cpue_kgkm2 cpue_nokm2
      <dbl>        <dbl>     <dbl> <dbl>          <dbl>      <dbl>      <dbl>
 1   -21974        68560         0     0         0.0269          0          0
# ℹ 124,112 more rows
# ℹ Use `print(n = ...)` to see more rows

> gap_products_akfin_biomass0 %>% 
+     dplyr::filter(species_code %in% c(69323, 69322, 68580, 68560))
# A tibble: 4,528 × 16
   survey_definition_id area_id species_code  year n_haul n_weight n_count n_length cpue_kgkm2_mean cpue_kgkm2_var cpue_nokm2_mean
                  <dbl>   <dbl>        <dbl> <dbl>  <dbl>    <dbl>   <dbl>    <dbl>           <dbl>          <dbl>           <dbl>
 1                   98      10        68560  1987     58       29      29        0          165.         5920.             1065. 
# ℹ 4,527 more rows
# ℹ 5 more variables: cpue_nokm2_var <dbl>, biomass_mt <dbl>, biomass_var <dbl>, population_count <dbl>, population_var <dbl>
# ℹ Use `print(n = ...)` to see more rows

zoyafuso-NOAA commented 9 months ago

Hi Emily,

Sounds good to me. As a quick fix, here are the SQL calls that would remove those SPECIES_CODE values from the GAP_PRODUCTS.BIOMASS and GAP_PRODUCTS.AKFIN_BIOMASS tables:

"DELETE FROM GAP_PRODUCTS.BIOMASS WHERE SPECIES_CODE IN (69323, 69322, 68580, 68560)"

"DELETE FROM GAP_PRODUCTS.AKFIN_BIOMASS WHERE SPECIES_CODE IN (69323, 69322, 68580, 68560)"

I will leave you to do that at your leisure. The long-term fix is that I will filter out these species codes after calculating biomass in future GAP_PRODUCTS production runs.

EmilyMarkowitz-NOAA commented 9 months ago

Sounds good. Secondary thought - do we only want to apply that deletion/filtering out for eastern and northern Bering Sea surveys? Just to make sure, @ShannonHennessey - your team does not provide estimates for the Aleutian Islands or Gulf of Alaska, right? And it would still be ok for us to produce those estimates for crab in those areas? For the record, I don't think we catch many of these species in those areas.

ShannonHennessey commented 9 months ago

That's correct, we just do the eastern and northern Bering Sea. I would imagine it's ok for you to keep producing estimates for crab in the Aleutians and Gulf of Alaska, especially because there won't be any duplicate or conflicting estimates on our end. I'm also happy to double check with Mike if that would be helpful!

EmilyMarkowitz-NOAA commented 9 months ago

Super! Thanks for confirming, Shannon. Initiating short-term plan now, keen for long-term plan soon!

zoyafuso-NOAA commented 9 months ago

NBS/EBS crab data (SPECIES_CODE values 69323, 69322, 68580, 68560) have been removed from GAP_PRODUCTS.BIOMASS and GAP_PRODUCTS.AKFIN_BIOMASS and will not be included in future production runs of GAP_PRODUCTS.

EmilyMarkowitz-NOAA commented 9 months ago

> NBS/EBS crab data (SPECIES_CODE values 69323, 69322, 68580, 68560) have been removed from GAP_PRODUCTS.BIOMASS and GAP_PRODUCTS.AKFIN_BIOMASS and will not be included in future production runs of GAP_PRODUCTS.

Just for Bering Sea, right? (Just confirming)

zoyafuso-NOAA commented 9 months ago

Only EBS and NBS records for those SPECIES_CODE values were removed. Bering Sea Slope, Aleutian Islands, and Gulf of Alaska records for those SPECIES_CODE values are still in GAP_PRODUCTS.
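
The region-specific rule described here can be sketched in R with dplyr. This is a minimal illustration, not the production code: the `biomass` rows are made up, and the SURVEY_DEFINITION_ID values (98 for EBS, as in the output earlier in this thread, and 143 for NBS) are assumptions that should be checked against the survey lookup table.

```r
library(dplyr)

# Crab species codes named in this thread.
crab_codes <- c(69323, 69322, 68580, 68560)

# ASSUMPTION: 98 = EBS and 143 = NBS. Verify before relying on these IDs.
bering_shelf_ids <- c(98, 143)

# Made-up rows standing in for a biomass table. 47 is a placeholder for any
# survey other than EBS/NBS; 21720 stands in for any non-crab species code.
biomass <- tibble::tibble(
  survey_definition_id = c(98, 143, 47, 98),
  species_code         = c(68560, 69322, 68560, 21720),
  biomass_mt           = c(100, 50, 5, 2000)
)

# Drop a row only when it is BOTH a listed crab code AND an EBS/NBS record.
# Crab rows from other surveys and all non-crab rows are kept.
biomass_filtered <- biomass %>%
  dplyr::filter(!(species_code %in% crab_codes &
                    survey_definition_id %in% bering_shelf_ids))
```

Combining both conditions inside a single negation is what keeps crab records from the other survey regions; filtering on `species_code` alone would remove them everywhere.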

EmilyMarkowitz-NOAA commented 7 months ago

I just realized that 68590 (Chionoecetes hybrid Tanner crab) may also need to be removed from our data. @ShannonHennessey can you double check that your team provides estimates for these? If SAP provides estimates for these, we'll need to update our scripts to remove this species. Are there any others that I forgot? Thanks!

ShannonHennessey commented 7 months ago

We do provide estimates for 68590 Chionoecetes hybrids!

EmilyMarkowitz-NOAA commented 7 months ago

Oh so glad I asked then! @zoyafuso-NOAA, you'll remove that species code? I forgot where exactly we filter these codes out, but feel free to tell me and I can push the edit. Thanks @ShannonHennessey!

zoyafuso-NOAA commented 7 months ago

Done. This will be incorporated in the next production run.
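
One way to keep the excluded codes in a single place, so that adding a code such as 68590 is a one-line change, is an exclusion table applied with `dplyr::anti_join()`. The sketch below is hypothetical and is not how the production scripts are known to work: the table and object names are invented, the `biomass` rows are made up, and the SURVEY_DEFINITION_ID values 98 (EBS) and 143 (NBS) are assumptions to verify.

```r
library(dplyr)

# Species codes from this thread, including the later addition of 68590.
excluded_codes <- c(69323, 69322, 68580, 68560, 68590)

# ASSUMPTION: 98 = EBS and 143 = NBS.
excluded_surveys <- c(98, 143)

# Every (species_code, survey_definition_id) pair that should be dropped.
exclusions <- expand.grid(
  species_code         = excluded_codes,
  survey_definition_id = excluded_surveys
)

# Made-up rows standing in for a biomass table. 52 is a placeholder for a
# survey other than EBS/NBS; 21720 stands in for any non-crab species code.
biomass <- tibble::tibble(
  survey_definition_id = c(98, 143, 52, 98),
  species_code         = c(68590, 68560, 68590, 21720),
  biomass_mt           = c(10, 50, 1, 2000)
)

# anti_join() keeps only the rows of `biomass` with no match in `exclusions`.
biomass_kept <- dplyr::anti_join(
  biomass, exclusions,
  by = c("species_code", "survey_definition_id")
)
```

With this layout, extending the exclusion to another species or survey only means editing `excluded_codes` or `excluded_surveys`.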