pfmc-assessments / PacFIN.Utilities

R code to manipulate data from the PacFIN database for assessments
http://pfmc-assessments.github.io/PacFIN.Utilities

Data request for species Petrale Sole (Eopsetta jordani) #87

Closed iantaylor-NOAA closed 1 year ago

iantaylor-NOAA commented 1 year ago

Identify the common name(s) or PacFIN species code(s)

GRND FLAT PTR1                 NOM. PETRALE SOLE               
GRND FLAT PTRL                 PETRALE SOLE           

Type of data needed

catch, biological data

Additional information

Requested by @gertsevv and @iantaylor-NOAA. @chantelwetzel-noaa has volunteered to pull this data.

EDIT (checklist added on 6 April 2023):

Do you have clearance to access these data?

chantelwetzel-noaa commented 1 year ago

Completed.

iantaylor-NOAA commented 1 year ago

Email today from @aliwhitman says 2022 petrale sole ages have now been added to PacFIN BDS. Is it OK to re-open this issue to get a new BDS pull, or should I post a new one?

kellijohnson-NOAA commented 1 year ago

@iantaylor-NOAA re-opening an issue is just fine, and perhaps preferred rather than opening a new one.

I pulled the data and am encountering errors with data uploaded from CalCOM, where there are duplicate entries with unique BDS_IDs. I will need to email Brenda Irwin and ask what is going on. Until then, I am not super comfortable sharing this pull, even though it includes the updated information from Oregon. Hopefully, California will fix the duplicated data and it will be automatically uploaded to PacFIN tomorrow, so stay tuned. If they don't get it fixed quickly, I will just manually remove the duplicates before sharing the data.
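As a rough illustration of the kind of check described here, the sketch below flags rows that are identical in every column except the identifier. It uses base R only and an invented toy data frame; `bds_toy` and `FISH_LENGTH` are hypothetical stand-ins, and the real comparison may well need a different set of columns.

```r
# Toy stand-in for bds.pacfin; all values are invented for illustration.
bds_toy <- data.frame(
  BDS_ID = 1:5,
  SAMPLE_YEAR = c(2021, 2021, 2022, 2022, 2022),
  AGENCY_CODE = c("C", "C", "C", "C", "O"),
  FISH_LENGTH = c(310, 310, 402, 402, 355), # hypothetical column name
  FINAL_FISH_AGE_IN_YEARS = c(5, 5, 9, 9, 7)
)

# Compare rows on every column except the identifier, because the
# duplicated records carry unique BDS_IDs.
compare_cols <- setdiff(names(bds_toy), "BDS_ID")
is_dup <- duplicated(bds_toy[, compare_cols])

# Rows that repeat an earlier record
bds_toy[is_dup, ]

# Keep only the first copy of each record
bds_dedup <- bds_toy[!is_dup, ]
```

Tabulating `is_dup` by `SAMPLE_YEAR` and `AGENCY_CODE` would show whether the duplicates are confined to particular years or agencies.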

chantelwetzel-noaa commented 1 year ago

@kellijohnson-NOAA In a meeting I was in with CDFW they mentioned a duplicate record issue with data between 2016-2018ish. If the issue you are seeing aligns with these years, I think they are already reaching out to people about fixing this. However, reaching out to Brenda may be useful to understand the potential timeline for remedying this issue.

kellijohnson-NOAA commented 1 year ago

Thanks @chantelwetzel-noaa for the intel.

brianlangseth-NOAA commented 1 year ago

I imagine this issue exists for all species. We are expecting updated ages for canary too, and so will similarly request updated PacFIN BDS data (in addition to updated catch files for complete 2022 values). Will do so in another issue.

iantaylor-NOAA commented 1 year ago

@brianlangseth-NOAA, good point. We definitely don't want to be constantly requesting data pulls. However, in the case of Petrale there's a public meeting on Monday where we're talking about these ages, so it would be good to have access to whatever is in the database.

@kellijohnson-NOAA, for purposes of the meeting, it would be helpful to know how many ages are available from Oregon. The BDS data we have so far shows the following monthly distribution of the 2022 samples and zero samples from Oregon in 2021.

@aliwhitman, you probably already told me and I forgot, but is it correct that there should be zero samples from 2021?

r$> bds.pacfin %>%
      dplyr::filter(SAMPLE_YEAR == 2022, AGENCY_CODE == "O", !is.na(FINAL_FISH_AGE_IN_YEARS)) %>%
      select(SAMPLE_MONTH) %>%
      table()
SAMPLE_MONTH
 1  2  3  4  5  6  7 
41 77 43 46 56 63 24 

kellijohnson-NOAA commented 1 year ago

bds.pacfin %>%
  dplyr::filter(SAMPLE_YEAR >= 2020, AGENCY_CODE == "O", !is.na(FINAL_FISH_AGE_IN_YEARS)) %>%
  dplyr::count(SAMPLE_YEAR, SAMPLE_MONTH)

   SAMPLE_YEAR SAMPLE_MONTH  n
1         2020            1 59
2         2020            2 81
3         2020            3 71
4         2020            4 45
5         2020            5 55
6         2020            6 50
7         2020            7 35
8         2020            8 43
9         2020            9 30
10        2020           10 35
11        2020           11 25
12        2020           12 55
13        2021            1 42
14        2021            2 66
15        2021            3 42
16        2021            4 55
17        2021            5 98
18        2021            6 18
19        2021            7 12
20        2021            8 54
21        2021            9 36
22        2021           10 48
23        2021           11 90
24        2021           12 63
25        2022            1 41
26        2022            2 77
27        2022            3 43
28        2022            4 46
29        2022            5 56
30        2022            6 63
31        2022            7 40
32        2022            8 32

iantaylor-NOAA commented 1 year ago

@kellijohnson-NOAA, thank you. Good to see all the 2021 ages and more samples from July and August in 2022.
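For reference, the change in the 2022 Oregon counts between the two pulls can be tallied with a few lines of base R. The counts are the ones posted in this thread; the helper `pad_counts()` is a hypothetical name introduced only for this sketch.

```r
# Aged Oregon samples by month in 2022, as posted in this thread
old_2022 <- setNames(c(41, 77, 43, 46, 56, 63, 24), as.character(1:7))
new_2022 <- setNames(c(41, 77, 43, 46, 56, 63, 40, 32), as.character(1:8))

# Hypothetical helper: expand a named count vector to a common set of
# months, using zero for months with no samples.
pad_counts <- function(x, keys) {
  out <- setNames(numeric(length(keys)), keys)
  out[names(x)] <- x
  out
}

months <- union(names(old_2022), names(new_2022))
added <- pad_counts(new_2022, months) - pad_counts(old_2022, months)

# Months where the newer pull has more aged samples
added[added != 0]
```

With these numbers the only changes are 16 more ages in July and 32 in August.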

aliwhitman commented 1 year ago

Just confirming that Kelli's sample counts are accurate from my PacFIN BDS double checking as well. So I think we're all set!

kellijohnson-NOAA commented 1 year ago

@iantaylor-NOAA I have placed a bds file on the network (i.e., fram\Assessments\CurrentAssessments\petrale_2023\data). There are additional ages from Oregon in there up to November 2022. Please note the following:

  1. There are some "Purposive" samples from Washington in the data set. These will need to be removed prior to using the data for the assessment. Kristen Hinton made me aware that sometimes these samples are sampled for weight two times, leading to two rows of data, to determine whole weight and headed and gutted weight. Therefore, I have created a new column FISH_WEIGHT_GUTTED instead of having a single fish take up two rows of data. There are only 8 samples that have this information in your data set.
  2. The duplicative records from California in 2021 and 2022 have been removed. Brenda is checking into why these are there in the first place. Nevertheless, they appear to be EXACT duplicates so I just went ahead and removed them. I am NOT going to close this issue because you will still need a final data set come May 1, 2023, which I think is your data deadline.
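To illustrate the idea behind item 1, here is a minimal base R sketch of collapsing two weight rows per fish into one row with a separate gutted-weight column. Only `FISH_WEIGHT_GUTTED` comes from the description above; `FISH_ID`, `WEIGHT_TYPE`, `FISH_WEIGHT`, and the toy values are assumptions made for the example and are not the actual PacFIN column names or the code used for the pull.

```r
# Invented toy data: fish 2 was weighed twice (whole, then headed and gutted).
# FISH_ID, WEIGHT_TYPE, and FISH_WEIGHT are hypothetical column names.
fish_long <- data.frame(
  FISH_ID = c(1, 2, 2, 3),
  WEIGHT_TYPE = c("whole", "whole", "gutted", "whole"),
  FISH_WEIGHT = c(0.82, 1.10, 0.95, 0.67)
)

# Split the two kinds of measurement
whole <- fish_long[fish_long$WEIGHT_TYPE == "whole", c("FISH_ID", "FISH_WEIGHT")]
gutted <- fish_long[fish_long$WEIGHT_TYPE == "gutted", c("FISH_ID", "FISH_WEIGHT")]
names(gutted)[2] <- "FISH_WEIGHT_GUTTED"

# One row per fish; fish without a gutted weight get NA in the new column
fish_wide <- merge(whole, gutted, by = "FISH_ID", all.x = TRUE)
fish_wide
```

The left join keeps every fish exactly once, so downstream sample-size counts are not inflated by the second weighing.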

iantaylor-NOAA commented 1 year ago

Thank you @kellijohnson-NOAA. This is all helpful information. @gertsevv and I will make use of the updated data and look forward to the final extraction on May 1, which is indeed the deadline.

iantaylor-NOAA commented 1 year ago

@kellijohnson-NOAA, I have not heard anything from the states indicating that petrale data have changed since the extraction you did on April 5. However, just for completeness, would you be willing to check the sample sizes (or whatever other method you prefer) to see if anything has been added?

This is what I'm seeing in the file in \\nwcfile\FRAM\Assessments\CurrentAssessments\petrale_2023\data\PacFIN.PTRL.bds.05.Apr.2023.RData

r$> nrow(bds.pacfin)
[1] 263103

r$> sum(!is.na(bds.pacfin$FINAL_FISH_AGE_IN_YEARS))
[1] 104874

kellijohnson-NOAA commented 1 year ago

Results of the above code with the new pull are

263106
105302

so it looks like there are more ages that must have been uploaded to fish records that were already in the system. The new file is on the network. Sorry for the delay btw. I was on annual leave last Thursday.

iantaylor-NOAA commented 1 year ago

Thank you @kellijohnson-NOAA, the timing is just fine. Thank you for pulling the data. I am happy to get the additional 428 ages.
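For future pulls, the same two checks can be wrapped in a small function so the comparison is repeatable. This is only a sketch: `summarize_pull()` and the toy data frame are hypothetical, while the totals are the ones reported above.

```r
# Hypothetical helper: row count and number of aged fish in a BDS pull
summarize_pull <- function(bds) {
  c(rows = nrow(bds), aged = sum(!is.na(bds$FINAL_FISH_AGE_IN_YEARS)))
}

# Toy example with invented values: three fish, two of them aged
toy <- data.frame(FINAL_FISH_AGE_IN_YEARS = c(4, NA, 7))
summarize_pull(toy)

# Totals reported in this thread for the 5 April pull and the later pull
april_5 <- c(rows = 263103, aged = 104874)
later_pull <- c(rows = 263106, aged = 105302)

# Change between pulls
change <- later_pull - april_5
change
```

The difference is 3 rows and 428 ages, consistent with most of the new ages having been attached to fish records that were already in the system.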