pfmc-assessments / PacFIN.Utilities

R code to manipulate data from the PacFIN database for assessments
http://pfmc-assessments.github.io/PacFIN.Utilities

Nominal catches and unidentified species #55

Open kellijohnson-NOAA opened 3 years ago

kellijohnson-NOAA commented 3 years ago

Sorry to @chantelwetzel-noaa and @brianlangseth-NOAA for providing catches without nominal catches for square spot. This led me to investigate how nominal catches were being searched for.

We should always provide nominal catches, right? Please give a thumbs up or thumbs down to indicate whether you think we should always provide nominal catches.

Then, I went down a rabbit hole of looking at unidentified species. Would we ever work with a species that used "UNSP xxx" catches? For example, I see "UNSP. Shelf Rockfish" and "UNSP. Gopher rockfish". I think this question really pertains to @shcaba and @melissamonk-NOAA. Can people please comment on how best to provide the unidentified species categories that we think would be applicable? Perhaps Vlada and @iantaylor-NOAA can comment on skates, as I see "UNSP. Skate" too.

John-R-Wallace-NOAA commented 3 years ago

So that everyone is on the same page, the Comprehensive Fish Ticket Table document ( https://pacfin.psmfc.org/wp-content/uploads/2016/06/PacFIN_Comprehensive_Fish_Tickets.pdf ) states on page 2 under the title 'Background':

 The comprehensive fish ticket table (i.e., COMPREHENSIVE_FT) is based on state fish tickets, enhanced
 by applying the state agencies’ catch-by-area and species composition proportions to correct catch areas,
 nominal species categories and multispecies market categories for groundfish landings. These
 enhancements are built alongside the original raw data to allow PacFIN users the ability to query both the
 enhanced as well as raw data from the same table.

Using Vermilion Rockfish as an example from the Comprehensive FT table, where the finding of 'RCK8' is explained below, we have a tabulation of PACFIN_SPECIES_CODE, NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE, and their cross-tab:

 > Table(Verm_CompFT$PACFIN_SPECIES_CODE)
   RCK8   VRM1   VRML 
     21  19277 225307 

 > Table(Verm_CompFT$NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE)
   RCK8   VRML 
     21 244584 

 > Table(Verm_CompFT$PACFIN_SPECIES_CODE, Verm_CompFT$NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE)
          RCK8   VRML
   RCK8     21      0
   VRM1      0  19277
   VRML      0 225307

Note first that the NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE column is called NOMINAL_TO_ACTUAL_SPECIES_CODE (PACFIN is missing) at the very bottom of Table 1 on page 10 of the doc.

Given what is stated in the Background section, it appears that species compositions were applied to all the VRM1 and that's why none appear in NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE. Contrast this to RCK8 where no species composition was applied and it remains in NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE. The assessor would contact the states for any info on species comps for RCK8 and go from there. The cross-tab is also useful to see the amount of nominal spids that the species comps were applied to versus the actual spids.
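A minimal sketch of the same kind of cross-tab in base R, for anyone who wants to try the check on their own pull. The data frame `toy_ft` and its rows are invented for illustration; only the two column names come from the comprehensive fish ticket table, and `target` is a hypothetical variable name.

```r
# Toy stand-in for a comprehensive fish ticket pull (rows are invented).
toy_ft <- data.frame(
  PACFIN_SPECIES_CODE = c("VRML", "VRML", "VRM1", "VRM1", "RCK8"),
  NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE = c("VRML", "VRML", "VRML", "VRML", "RCK8"),
  stringsAsFactors = FALSE
)

# Rows are the reported code; columns are the code after the
# nominal-to-actual mapping.
xtab <- table(
  reported = toy_ft$PACFIN_SPECIES_CODE,
  mapped = toy_ft$NOMINAL_TO_ACTUAL_PACFIN_SPECIES_CODE
)
print(xtab)

# Codes other than the target species that remain in the mapped column.
target <- "VRML"
remaining <- setdiff(colnames(xtab), target)
print(remaining)
```

Any code left in `remaining` is a candidate for a follow-up with the states, as described above.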

Finding 'RCK8'

The second argument of my PacFIN.Catch.Extraction() function is called PacFIN_Common_Name and from the help is: "A character vector of PacFIN species common name used only for checking for a nominal species ID. The function will stop after the species codes are printed."

Using only 'Vermilion' was useful in this case since 'rockfish' in RCK8's common name is abbreviated to 'RCKFSH' (no 'O'), and hence searching for 'Vermilion rockfish' would have returned only the VRML spid. The nominal VRM1 spells Vermilion with two 'l's and hence needs a separate call.

> PacFIN.Catch.Extraction(PacFIN_Common_Name = 'Vermilion')

    SPID                   CNAME COMPLEX COMPLEX2 COMPLEX3 MGRP             SNAME
96  VRML      VERMILION ROCKFISH    ROCK     ....     NSLF GRND SEBASTES MINIATUS
239 RCK8 CANARY+VERMILION RCKFSH    ROCK     ....     NSLF GRND     SEBASTES SPP.

--- Checking for a nominal species ID using the PacFIN common name. ---    

> PacFIN.Catch.Extraction(PacFIN_Common_Name = 'Vermillion rockfish')

    SPID                    CNAME COMPLEX COMPLEX2 COMPLEX3 MGRP         SNAME
366 VRM1 NOM. VERMILLION ROCKFISH    ROCK     ....     NSLF GRND SEBASTES SPP.

If the misspelling is unknown, the user would use only VRML and RCK8. However, when the function is run, another check is done using the first 3 letters of the first SPID:

> PacFIN.VRML.Catch.INPFC.28.Jan.2021 <- PacFIN.Catch.Extraction("('VRML', 'RCK8')") #  RCK8 CANARY+VERMILION RCKFSH

Checking for nominal species names using the first 3 letters of the SPID listed first in the PACFIN_SPECIES_CODE argument

In rare cases this doesn't work to find the nominal species ID.

    SPID                    CNAME COMPLEX COMPLEX2 COMPLEX3 MGRP             SNAME
96  VRML       VERMILION ROCKFISH    ROCK     ....     NSLF GRND SEBASTES MINIATUS
366 VRM1 NOM. VERMILLION ROCKFISH    ROCK     ....     NSLF GRND     SEBASTES SPP.

...

The unknown misspelling is revealed, and the function would be restarted with the now revealed VRM1.
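The two checks described above (a common-name match plus a match on the first 3 letters of the SPID) can be sketched in base R as follows. This is not the code inside PacFIN.Catch.Extraction(); the helper `find_candidate_spids` and the data frame `sp` are hypothetical, and `sp` contains only the three rows printed in this comment.

```r
# Toy species table built from the rows printed in this comment.
sp <- data.frame(
  SPID = c("VRML", "RCK8", "VRM1"),
  CNAME = c(
    "VERMILION ROCKFISH",
    "CANARY+VERMILION RCKFSH",
    "NOM. VERMILLION ROCKFISH"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose common name contains `common_name`
# or whose SPID starts with the first three letters of `spid`.
find_candidate_spids <- function(species_table, common_name, spid) {
  by_name <- grepl(common_name, species_table$CNAME, ignore.case = TRUE)
  by_prefix <- startsWith(species_table$SPID, substr(spid, 1, 3))
  species_table[by_name | by_prefix, ]
}

hits <- find_candidate_spids(sp, common_name = "Vermilion", spid = "VRML")
print(hits$SPID)
```

In this toy table the name match alone finds VRML and RCK8, and the prefix match is what picks up the misspelled VRM1.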

brianlangseth-NOAA commented 3 years ago

@John-R-Wallace-NOAA This is good background. I'd be curious to know whether you think we should include nominal catches, though. Perhaps it's clear to everyone else, or maybe I missed it, but are you in favor of including nominal catches? Is there a solution other than "The assessor would contact the states for any info on species comps for RCK8 and go from there"?

John-R-Wallace-NOAA commented 3 years ago

Firstly, I assume you are talking about the nominal catches that do not already have a species composition applied to them.

Others would have more insight, but I did give this some thought.

I believe the general consensus is to include the assessment species proportion of the non-split nominal catches in the best way you can, unless it is so trivial it can be ignored.

There is a difference between those states that have completed an historical catch reconstruction and those that have not. Oregon has finished theirs and Vlada was on the team that did the reconstruction:

HISTORICAL RECONSTRUCTION OF OREGON’S COMMERCIAL FISHERIES LANDINGS Report of the Groundfish Historical Catch Reconstruction Workshop

The historical reconstruction team would have taken a fresh look at the non-split nominal catches and would have borrowed species comps both spatially and temporally. So if the team left any non-split nominal catches, there must be little additional information. In that case, one could split the catch evenly by the number of species in the complex. I could see checking old assessments, especially if the complex is only two species, to make sure the other species' abundance is not extremely different from that of the species being assessed in the area and time the complex was used. If one of the species was in the tank then an even split might be reconsidered.
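As a worked illustration of the even split, here is a minimal R sketch. The helper `split_complex_catch` and the landings values are made up; the `proportion` argument is an assumption added to show how an uneven split could be substituted when other information suggests one.

```r
# Hypothetical helper: portion of an unsplit complex assigned to one species.
# The default is an even split across the species in the complex.
split_complex_catch <- function(catch_mt, n_species,
                                proportion = 1 / n_species) {
  stopifnot(n_species >= 1, proportion >= 0, proportion <= 1)
  catch_mt * proportion
}

# Invented landings (mt) by year for a two-species complex such as RCK8.
complex_catch <- c("1990" = 2.0, "1991" = 0.5)

# Even split: each species gets half.
even_split <- split_complex_catch(complex_catch, n_species = 2)
print(even_split)

# Uneven split, e.g. if one species was known to be scarce at the time.
informed_split <- split_complex_catch(complex_catch, n_species = 2,
                                      proportion = 0.1)
print(informed_split)
```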

In a state without a historical catch reconstruction there may be institutional knowledge on a historical complex's composition. I am thinking both about folks that work for that state's fishery department and fishers who caught fish in that complex (perhaps from another state). When asked about RCK8, for example, someone may know that mostly Canaries were caught with only a few Vermilion. Other people like John DeVore and Dr. Hastie may also have insights or know who you should contact. And, of course, if the species has been assessed before, the old assessment is a resource.

I didn't see this topic in the unofficial handbook. Perhaps it could be added after others correct my assumptions.

kellijohnson-NOAA commented 3 years ago

@melissamonk-NOAA do you want to chime in about vermilion? Also, do you have access to PacFIN, or do you need one of us who has access to provide data for you?

melissamonk-NOAA commented 3 years ago

@kellijohnson-NOAA I do have PacFIN access. @EJDick-NOAA can respond more elegantly to this issue than I can this cycle. He's taken the lead on wrangling catch data for both vermilion assessments.

EJDick-NOAA commented 3 years ago

I checked the catch estimates for vermilion in PacFIN, and they are consistent with the original expansion in CALCOM. All of the landings with a 'nominal' source code in CALCOM were correctly assigned to the 'VRM1' species code and market category 249 (nominally "vermilion rockfish") in PacFIN. (For background, CALCOM only uses one species code per species, and a 'source' field to distinguish 'nominal' landings from those which were assigned based on species comp samples.)

The table below shows the number of records in the comprehensive fish ticket table in PacFIN by market category and "pacfin species code." All the nominal (VRM1) landings are in market category 249, as they should be. Landings categorized as VRML are spread across multiple market categories, since they were derived from applying species composition data to market categories on the fish tickets. The nominal VRM1 landings in market category 249 had no samples (and no 'nearby' samples to borrow from), so they are assumed to be 100% vermilion.

                 PACFIN_SPECIES_CODE
 MARKET_CATEGORY   VRM1   VRML
             245      0   1100
             246      0      5
             247      0   3034
             249  12853  34988
             250      0  62223
             252      0    402
             253      0   1516
             254      0    265
             255      0    126
             258      0     72
             259      0     11
             262      0    151
             263      0    379
             265      0   1116
             267      0    423
             268      0     11
             269      0     58
             270      0      1
             651      0     20
             652      0      1
             655      0    645
             659      0     14
             662      0      2
             663      0      9
             665      0    194
             667      0    269
             956      0   1241
             957      0   3689
             958      0    111
             959      0  70180
             960      0   4963
             961      0    672
             962      0  13031
             971      0      4
             973      0     40
             974      0    352
             975      0      1

The code "RCK8" is a group category for canary & vermilion, and is the same as market category 971 in PacFIN (for California, anyway; I am not sure about OR & WA). Like John W., I had to query these separately (number of records shown below). In the table above, the 4 records in mcat 971 are expanded estimates based on 3 samples that contained vermilion, all from port complex OSF in 1994 (hence the 'VRML' species code, which is based on data, or source = 'actual' in CALCOM). The rest of the records in the category (table below) had no samples and were therefore "nominal", but in this market category that could mean either canary or vermilion. I think John W's point was that any species with a large fraction of landings in this type of category should be examined more closely. In the case of vermilion, nominal landings in the 971 category sum to <0.3 mt across all years, so it's safe to ignore them.

                 PACFIN_SPECIES_CODE
 MARKET_CATEGORY   RCK8
             971     21

Sorry about the terrible formatting of tables... still need to figure that out.

kellijohnson-NOAA commented 1 year ago

As suggested by @iantaylor-NOAA, I am going to make a lookup table, stored in {PacFIN.Utilities}, that lists nuanced species codes that should be "checked" or "looked into" beyond just the nominal code. So, addnominal will only add the "official" nominal code for a species, but when you pull data the function will warn you that you might want to look at other related species codes. As we continue to assess more species we can develop this table further. For example, if you are pulling data for vermilion or canary, the function would alert you to also look at RCK8, but it would not pull the data for you. Thank you all for the suggestions over the past TWO years 😮 . Now that you can pull multiple species codes at once, I do not think this will be a major problem.
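A rough sketch of how such a lookup-and-warn step could behave, written in base R. This is not the implementation in {PacFIN.Utilities}; the object `codes_to_check` and the function `warn_extra_codes` are hypothetical names, and the table holds only the vermilion entry discussed in this thread (canary would get a similar entry).

```r
# Hypothetical lookup table: species code -> extra codes worth checking.
codes_to_check <- list(
  VRML = c("RCK8")
)

# Hypothetical helper: warn about extra codes but do not pull them.
warn_extra_codes <- function(spid, lookup = codes_to_check) {
  matched <- lookup[intersect(spid, names(lookup))]
  extra <- unique(unlist(matched, use.names = FALSE))
  # Do not warn about codes the user already asked for.
  extra <- setdiff(extra, spid)
  if (length(extra) > 0) {
    warning(
      "You may also want to look at: ", paste(extra, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(extra)
}

# Capture the warning text so it can be inspected.
msg <- tryCatch(
  warn_extra_codes("VRML"),
  warning = function(w) conditionMessage(w)
)
print(msg)
```

Returning the extra codes invisibly leaves the decision to pull them with the user, which matches the intent that the function alerts but does not fetch.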