Open chantelwetzel-noaa opened 4 months ago
Hi Chantel, the AFSC Groundfish Assessment Program (GAP) calculates agecomps, sizecomps, and biomass indices from raw data in their database using the gapindex package. These indices and other stock-assessment-ready data are then available in the gap_products schema on their Oracle database. GAP also transfers the gap_products schema to the Alaska Fisheries Information Network (AKFIN, my employer) for distribution. I created an API for each gap_products table, and the akfingapdata package is a wrapper for those APIs, with each function pulling the data from one table by species and area (or the whole table for smaller tables).
Since GAP has already done the heavy lifting of calculating indices and vetting specimen, length, and catch data in the gap_products framework, I think it makes sense to use that for these visualizations in Alaska.
I think that all regions have put in a lot of work to create code and data, and that part of the process here is working towards shared code and similar data structures. It would be really great to slowly work towards a shared set of functions to process the data. This would (1) reduce the amount of code needed to make this effort happen, (2) lead to fewer errors in code because more eyes would be using and reviewing it, and (3) create a process of working together on more than just plots. I understand if that cannot happen at this stage, but at a minimum I think we should have a conversation about what the "input" data here look like.
I have been reviewing the code for pulling, processing, and plotting AFSC survey data created and shared by @MattCallahan-NOAA. Based on the readme in the akfingapdata repository, it appears that the pulling and processing of catch and biological data are done within a single function (or the data stored in the database have already been expanded) by the `get_gap_biomass()`, `goasr_sizecomp()`, and `get_gap_agecomp()` functions. Is this correct?
Here is a description of how the NWFSC survey data are stored and processed:
- `pull_catch()` and `pull_bio()` in the nwfscSurvey package.
- `Biomass.fn()`.
- `SurveyAFs.fn()`. The output of this function is a formatted matrix of the proportion by year, sex, and size/age bins.
In my mind, there are a few different potential pathways here:
Once we have decided on this, we can dig into creating unified data frames.
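To make the idea of a unified data frame concrete, here is a minimal sketch in base R of mapping two region-specific biomass tables onto one shared schema. Everything in it is an assumption for illustration: the toy tables, their column names and values, the shared column names (`year`, `estimate`, `se`, `region`), and the helper `to_unified_index()` are hypothetical and do not reflect the actual output of akfingapdata or nwfscSurvey.

```r
# Hypothetical region-specific index tables. Column names and values are
# invented for illustration and are NOT the real package output.
afsc_index <- data.frame(
  year = c(2019, 2021),
  biomass_mt = c(1200, 1350),
  biomass_var = c(400, 520)
)
nwfsc_index <- data.frame(
  Year = c(2019, 2021),
  est = c(800, 760),
  se = c(90, 85)
)

# Hypothetical helper: rename region-specific columns to a shared schema
# and tag each row with its region.
# `mapping` is a named character vector whose names are the shared column
# names and whose values are the region-specific column names.
to_unified_index <- function(df, mapping, region) {
  missing_cols <- setdiff(mapping, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  out <- df[, unname(mapping), drop = FALSE]
  names(out) <- names(mapping)
  out$region <- region
  out
}

# The toy AFSC table stores a variance, so derive a standard error first
# so that both regions report uncertainty on the same scale.
afsc_index$biomass_se <- sqrt(afsc_index$biomass_var)

# Stack both regions into one long table with identical columns.
unified <- rbind(
  to_unified_index(
    afsc_index,
    c(year = "year", estimate = "biomass_mt", se = "biomass_se"),
    "AFSC"
  ),
  to_unified_index(
    nwfsc_index,
    c(year = "Year", estimate = "est", se = "se"),
    "NWFSC"
  )
)
```

Keeping the mapping as data rather than hard-coding renames means each region only has to supply one named vector, and the failure on missing columns surfaces schema drift early. The same pattern could be applied to composition tables once the shared columns are agreed on.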