afsc-assessments / afscdata

An R package for data extraction of AFSC survey and fishery data
https://afsc-assessments.github.io/afscdata/
Other
2 stars 0 forks source link

switch to change filter #11

Closed BenWilliams-NOAA closed 1 year ago

BenWilliams-NOAA commented 1 year ago

@JaneSullivan-NOAA I added a switch to change the catch filter from group code to agency code (couldn't remember if you renamed that afsc_code, nmfs_code? so left it as agency_species_code) - let me know if this takes care of the issue for your needs.

https://github.com/afsc-assessments/afscdata/blob/320353f860b9e43d7c66557fed4a24006d322007/R/queries.R#L9

JaneSullivan-NOAA commented 1 year ago

stringr::str_replace() is for sure a preferred solution to having a separate query

use_afsc_code should probably be refactored... maybe use_agency_code or use_species_code instead? it's called "agency" because three digit codes are a mix of ADF&G and AKR species codes. there are two "AFSC codes", RACE and NORPAC.

BenWilliams-NOAA commented 1 year ago

@JaneSullivan-NOAA would appreciate thoughts on going toward dbplyr to keep things more in the R context - which i find easier for using functions/switches etc. this test.R function should do the same task as q_catch though it accounts for the different species inputs (group code or agency code), can add additional columns easily, and covers the other issue of fmp_area/fmp_subarea

JaneSullivan-NOAA commented 1 year ago

@BenWilliams-NOAA just pulled and will test asap. My initial thoughts: I really like that it simplifies the existing framework and is more flexible and readable. I see the utility in an intermediate step that prints, saves, or allows the user to access the translated SQL code for modification or other use. I'm also curious about speed. Will follow up soon.

JaneSullivan-NOAA commented 1 year ago

@BenWilliams-NOAA From email exchange:... "a few bugs with area vector length and the species code data type."

  1. Looks like you already fixed the issue with area vector length here using isTRUE()
  2. The species code data type issue has to do with is.numeric(species)

If you input species as numeric (e.g. species = c(153, 154), you will end up with

FROM (council.comprehensive_blend_ca) "q01"
WHERE
  ("YEAR" <= 2022.0) AND
  ("FMP_SUBAREA" IN ('WG', 'CG', 'WY', 'EY', 'SE')) AND
  ("AGENCY_SPECIES_CODE" IN (153.0, 154.0))

If you input species as a character (e.g. species = c('153', '154'), you end up with

WHERE
  ("YEAR" <= 2022.0) AND
  ("FMP_SUBAREA" IN ('WG', 'CG', 'WY', 'EY', 'SE')) AND
  ("SPECIES_GROUP_CODE" IN ('153', '154'))
BenWilliams-NOAA commented 1 year ago

adjusted the statement so that it is looking for either 3 digits in the string or not i think that takes care of this issue (thanks for looking into it!). i put a note in the @ param that the input needs to be either all 3 or all 4 digit codes