DATA 2 Recreational Data

brianlangseth-NOAA commented 3 years ago

Issues listed in the data spreadsheet for the recreational fleet(s)

[x] Landings @kellijohnson-NOAA
- [x] Washington
- [x] Oregon
- [x] Northern California (to be added to Oregon but should be available as a separate time series in case selectivity is vastly different)
- [x] Southern California
- [ ] document how the landings were created and the many problems of California recreational data not being able to inform landings plus discards
- [x] figure of selectivity and retention and what that means for biomass by @iantaylor-NOAA
[x] Length comps
[x] Age comps
[x] CAAL
[x] CPUE
- [x] CA Rec shore-based (MM)
- [x] DebWV CPFV (MM)
- [x] CA Rec onboard (MM)
- [x] CA CRFS RP (MM)
- [x] OR Rec shore-based (AW)
- [x] OR CPFV (AW)
- [x] OR Rec onboard (AW)
- [x] WA Rec shore-based (IT)
- [x] commercial trawl (JW)
[x] ~discard~ (we've chosen not to use these)

kellijohnson-NOAA commented 3 years ago

Issues that will need to be resolved with respect to recreational catches include the following:

[x] Washington recreational catches were provided with the units of number of fish and not weight of fish (mt) see #30
- Asked WDFW for catches in weight (2021-04-12)

kellijohnson-NOAA commented 3 years ago

@aliwhitman Do you know why there would be differences between the previous recreational landings and this assessment's time series? I used TOTAL MORTALITY_MT in LINGCOD_FINAL RECREATIONAL LANDINGS_1979 - 2020_byMODE on the drive.

Also, do not panic about the differences for WA, because the old assessment was in numbers. Just note that the peaks and valleys line up.

aliwhitman commented 3 years ago

Yes, so the previous assessment probably used MRFSS landings prior to 2001, as I had not yet created the ODFW comprehensive sport reconstruction. We would officially recommend going with our reconstructed estimates, as there were documented issues with effort estimates that made MRFSS estimates go nuts. It's hard to see from the graph, but it looks like there were no differences after 2000. All of those would be from RecFIN landings, so again, might be minor corrections to those, but should be the same as the 2017 assessment.

kellijohnson-NOAA commented 3 years ago

Thanks Ali for the quick response, here is the figure zoomed in a little

aliwhitman commented 3 years ago

Hmm, so I suppose it's a bit strange to see those differences in 2003 and 2006 (if I'm counting right). But I did have to re-do the sport shore/estuary reconstruction for this species, which would be included in these landings, so that may account for some of those differences. What is the difference in mt between those two years? Is it large enough to be concerned about? I can investigate...

kellijohnson-NOAA commented 3 years ago

imo these are not big enough differences to matter

   Year        mt seas fleet    catch catch_se        diff
1  1974  80.38470    1     4  80.3847     0.01   0.0000000
2  1975  84.79830    1     4  84.7983     0.01   0.0000000
3  1976 116.75200    1     4 116.7520     0.01   0.0000000
4  1977 110.20500    1     4 110.2050     0.01   0.0000000
5  1978 118.88200    1     4 118.8820     0.01   0.0000000
6  1979 159.59579    1     4 121.6730     0.01  37.9227910
7  1980 219.95342    1     4 149.7620     0.01  70.1914220
8  1981 174.29725    1     4 117.4490     0.01  56.8482540
9  1982 195.87732    1     4 119.6400     0.01  76.2373203
10 1983 153.08531    1     4 129.0110     0.01  24.0743098
11 1984 170.30633    1     4 143.8820     0.01  26.4243336
12 1985 169.49923    1     4  98.9423     0.01  70.5569338
13 1986 173.64758    1     4  92.4084     0.01  81.2391818
14 1987 186.78770    1     4 122.9340     0.01  63.8537020
15 1988 113.83831    1     4  90.4889     0.01  23.3494147
16 1989 158.07099    1     4 120.0000     0.01  38.0709872
17 1990 127.24230    1     4  96.8932     0.01  30.3490977
18 1991 103.28412    1     4  73.5032     0.01  29.7809234
19 1992 155.51742    1     4 112.4030     0.01  43.1144151
20 1993 237.24333    1     4 145.8750     0.01  91.3683322
21 1994 195.58297    1     4 142.4600     0.01  53.1229669
22 1995  98.43880    1     4  79.5829     0.01  18.8558994
23 1996 118.37793    1     4  93.1602     0.01  25.2177328
24 1997 145.61289    1     4 110.8400     0.01  34.7728901
25 1998  92.69975    1     4  69.9553     0.01  22.7444475
26 1999  87.98690    1     4  79.7256     0.01   8.2612973
27 2000  54.74416    1     4  51.1930     0.01   3.5511642
28 2001  60.61127    1     4  61.7592     0.01  -1.1479301
29 2002  87.62569    1     4  82.4047     0.01   5.2209881
30 2003 107.25035    1     4 122.4960     0.01 -15.2456504
31 2004 111.06830    1     4 108.7280     0.01   2.3402962
32 2005 135.08541    1     4 140.8430     0.01  -5.7575913
33 2006 128.91791    1     4 107.6150     0.01  21.3029125
34 2007 102.41119    1     4 104.0210     0.01  -1.6098127
35 2008  87.10193    1     4  89.3434     0.01  -2.2414713
36 2009  78.54225    1     4  78.7621     0.01  -0.2198466
37 2010  92.01430    1     4  93.9423     0.01  -1.9279989
38 2011 113.53119    1     4 114.9880     0.01  -1.4568078
39 2012 152.59009    1     4 155.2520     0.01  -2.6619086
40 2013 223.46068    1     4 224.0010     0.01  -0.5403221
41 2014 175.40359    1     4 176.0940     0.01  -0.6904111
42 2015 229.05586    1     4 226.1690     0.01   2.8868573
43 2016 152.48415    1     4 154.6620     0.01  -2.1778545
44 2017 183.69398   NA    NA       NA       NA          NA
45 2018 222.39387   NA    NA       NA       NA          NA
46 2019 171.84769   NA    NA       NA       NA          NA
47 2020 172.46572   NA    NA       NA       NA          NA
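For anyone wanting to repeat the comparison, here is a minimal Python sketch of the check behind the `diff` column, which is `mt` minus `catch`. It uses a handful of rows copied from the table; the 10 mt threshold is only an illustrative assumption, not a value anyone in this thread agreed on.

```python
# A few rows copied from the table above: (year, mt, catch).
rows = [
    (1978, 118.88200, 118.8820),
    (1993, 237.24333, 145.8750),
    (2003, 107.25035, 122.4960),
    (2006, 128.91791, 107.6150),
    (2016, 152.48415, 154.6620),
]


def flag_differences(rows, threshold_mt=10.0):
    """Return (year, diff) pairs where |mt - catch| exceeds threshold_mt.

    threshold_mt is a hypothetical cutoff chosen for illustration only.
    """
    flagged = []
    for year, mt, catch in rows:
        diff = mt - catch  # same definition as the diff column
        if abs(diff) > threshold_mt:
            flagged.append((year, diff))
    return flagged


flagged = flag_differences(rows)
for year, diff in flagged:
    print(year, round(diff, 2))
```

With these rows the flagged years are 1993, 2003, and 2006; the post-2000 flags are the two years Ali asked about.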

aliwhitman commented 3 years ago

I agree. Those differences are likely almost all shore/estuary catches.

brianlangseth-NOAA commented 3 years ago

@iantaylor-NOAA and @kellijohnson-NOAA I need help navigating all of the recreational data files in the drive (I believe that is the single location for all the data). Can you confirm the recreational data list below for drawing up comps? Also, we need both age and length comps, correct?

From the drive:

OREGON:
- Lingcod_MRFSS BIO_1980 - 2003.xlsx
- RecFIN_LINGCOD_BIO-LW_2001-2020.csv

What about the special projects data? I'm under the impression we don't work these up.

CALIFORNIA:
- mrfss_type_3_1980_2003_lingcod.xlsx
- ~~CRFS Lingcod Lengths 2004-2019.csv~~ [This is just a summary of the data, not the data itself.] Should I use the RecFIN data in lieu of these?

The 2003_type_3d_Lingcod.xlsx has the same number of records as mrfss_type_3_1980_2003_lingcod.xlsx. Not sure what the difference is.

WASHINGTON:
- ~~Lingcod Biodata as of 3_29_2021(More Ages Coming Daily).xlsx~~
- Lingcod_Coastal_Sport_05202021.xlsx (sport only)

I also see pulled RecFIN data as of 4-19-21. Use SD501 and SD506, correct? It appears these fill in the gaps if state data aren't available, but for years that overlap I plan to use the state-provided data.
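A rough Python sketch of the "state data first, RecFIN to fill gaps" rule described here. The function name and the example years are hypothetical; the real year coverage would come from the files themselves.

```python
def choose_source(years, state_years, recfin_years):
    """Pick a data source per year: state data when present, else RecFIN.

    Returns a dict mapping year -> "state", "recfin", or None (no data).
    """
    choice = {}
    for year in years:
        if year in state_years:
            choice[year] = "state"  # state-provided data win in overlap years
        elif year in recfin_years:
            choice[year] = "recfin"  # RecFIN only fills the gaps
        else:
            choice[year] = None
    return choice


# Hypothetical coverage, for illustration only.
sources = choose_source(
    years=range(1999, 2004),
    state_years={2001, 2002},
    recfin_years={1999, 2000, 2001},
)
print(sources)
```

Note that Oregon ended up being handled differently for lengths: both datasets were pooled in the overlapping years rather than one being preferred.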

I'll start working these up and matching them with what we did for dataModerate, so my questions may be more informed in a few hours, but I wanted to get these questions out there sooner rather than later to ensure I'm capturing all the needed data.

aliwhitman commented 3 years ago

You've got the correct datasets for OR length comps. As for OR age comps, either the RecFIN pull from 4/19 or the spreadsheet I provided (which is a RecFIN pull from 4/14, Oregon_"LINGCOD_RecFIN_ages_1999-2019_nonconfid.csv") should work. Though I have not compared the two to double check.

I didn't think we were going to work up special projects data either, but that's ultimately a call for Ian/Kelli.

kellijohnson-NOAA commented 3 years ago

Regarding Washington: I just emailed Theresa asking her if we need an updated file, because the number of aged fish per year in Lingcod Biodata as of 3_29_2021(More Ages Coming Daily).xlsx does not match what was pulled from RecFIN.

brianlangseth-NOAA commented 3 years ago

Is there a list of county codes in California for the recreational data? @kellijohnson-NOAA do you have these from your catch calculations? I need to know which are north of 40°10′ and which are south.

melissamonk-NOAA commented 3 years ago

@brianlangseth-NOAA ; Del Norte FIPS 15 is the only county fully north of 40-10. You'll have to split Humboldt FIPS 23.

brianlangseth-NOAA commented 3 years ago

@melissamonk-NOAA Thank you. And just to be sure we are comparing apples to apples, I'm looking at the CNTY entry. Thus 15 and 23 there?

I ask because there are 33 unique county entries. I'm a little surprised, seeing as CA doesn't have that many counties along the coast.

melissamonk-NOAA commented 3 years ago

@brianlangseth-NOAA, Yes, they should be the same. I keep a printed copy of this map handy. Thirty-three does seem a bit high. There are a number of counties near SF that are on the water, but it should be closer to 20. Let me know if you want a county look-up table for R.

brianlangseth-NOAA commented 3 years ago

Thanks. @melissamonk-NOAA I was able to track down a county list I forgot about and I figured out my issue. The ca_mrfss data includes thousands of WA and OR samples! When removing those I get 18 CA counties.
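For the record, the check that exposed the problem amounts to counting unique `CNTY` values before and after restricting to California. A minimal Python sketch follows; the `state` field name and the example records are hypothetical, and only `CNTY` is taken from the data as described above.

```python
# Hypothetical records; the real ca_mrfss file has many more fields.
records = [
    {"state": "CA", "CNTY": 15},
    {"state": "CA", "CNTY": 23},
    {"state": "CA", "CNTY": 23},
    {"state": "OR", "CNTY": 7},
    {"state": "WA", "CNTY": 27},
]


def unique_counties(records, state=None):
    """Sorted unique CNTY codes, optionally restricted to one state."""
    return sorted(
        {r["CNTY"] for r in records if state is None or r["state"] == state}
    )


all_counties = unique_counties(records)
ca_counties = unique_counties(records, state="CA")
print(len(all_counties), "codes overall;", len(ca_counties), "in CA")
```

County codes repeat across states, so counting them without a state filter can overstate the number of California counties.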

brianlangseth-NOAA commented 3 years ago

Thank you @aliwhitman for confirmation. I see that the two length datasets for Oregon overlap during 2001-2003. Is there any special treatment needed for those years? My assumption is to use all lengths from both datasets when making the length compositions.

aliwhitman commented 3 years ago

There shouldn't be any issues with using both datasets for those overlapping years. Go for it. :)
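A small Python sketch of what pooling both datasets means for the compositions: lengths from the two sources are concatenated, binned, and converted to proportions by year. The 2 cm bin width, the function name, and the example lengths are assumptions for illustration and are not taken from the actual comps script.

```python
from collections import Counter, defaultdict


def length_comps(records, bin_width=2):
    """Proportion of fish per length bin, by year.

    records: iterable of (year, length_cm) pairs.
    Bins are labeled by their lower edge; bin_width is a hypothetical choice.
    """
    counts = defaultdict(Counter)
    for year, length_cm in records:
        lower_edge = int(length_cm // bin_width) * bin_width
        counts[year][lower_edge] += 1
    comps = {}
    for year, bin_counts in counts.items():
        n = sum(bin_counts.values())
        comps[year] = {b: bin_counts[b] / n for b in sorted(bin_counts)}
    return comps


# Hypothetical lengths from the two Oregon sources in an overlap year.
mrfss = [(2001, 50.5), (2001, 61.0)]
recfin = [(2001, 51.9), (2002, 70.2)]
comps = length_comps(mrfss + recfin)  # pool first, then bin
print(comps)
```

Pooling before binning means each measured fish carries equal weight regardless of which dataset it came from.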

brianlangseth-NOAA commented 3 years ago

Thanks all for the assistance with the rec comps. These are loaded, along with an rda of the comps. Decision points are outlined in the script, but I briefly describe them below.

- For Oregon, used a combination of state-provided and RecFIN-pulled data, depending on year.
- For Washington, used the newly provided sport data only.
- For California, used MRFSS and RecFIN data. Not using the 2003_3d data.
- Designated all Humboldt County records as "north". This was done to match the treatment of the catch.
- Ultimately excluded fish designated as "released" (~7000 records, which is still a small percentage).
- Converted the few Washington lengths reported in total length to fork length using the Laidig conversion.
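As an illustration of two of these decisions, here is a minimal Python sketch of the area assignment and the removal of released fish. County codes 15 (Del Norte) and 23 (Humboldt) come from the discussion above; the `released` field, the function names, and the example records are hypothetical and do not reflect the actual script.

```python
# Counties treated as north of 40°10′: Del Norte (15) and, by the decision
# above, all of Humboldt (23) to match the treatment of the catch.
NORTH_CNTY = {15, 23}


def assign_area(cnty):
    """Return "north" or "south" for a California CNTY code."""
    return "north" if cnty in NORTH_CNTY else "south"


def prepare_records(records):
    """Drop released fish and tag the remaining records with an area."""
    kept = []
    for r in records:
        if r["released"]:  # hypothetical flag for fish designated as released
            continue
        kept.append({**r, "area": assign_area(r["CNTY"])})
    return kept


# Hypothetical example records.
example = [
    {"CNTY": 15, "length_cm": 62.0, "released": False},
    {"CNTY": 23, "length_cm": 48.0, "released": True},
    {"CNTY": 23, "length_cm": 71.0, "released": False},
    {"CNTY": 53, "length_cm": 55.0, "released": False},
]
prepared = prepare_records(example)
print([(r["CNTY"], r["area"]) for r in prepared])
```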

Wish list

CAAL

iantaylor-NOAA commented 3 years ago

I updated the initial comment in this issue to list the 5 potential recreational CPUE indices that we may use.

I took a look at the index standardization script for the WA Rec CPUE index to make sure that the new data (2017-2020) provided by WDFW have the same format as the previous set. Indeed they do, so we're set to go forward with index standardization.

I started working through the script which does the data filtering and there are a number of confusing things that will take some work to get through. At this point, I think we should use the status-quo index covering 1981-2016 while working on getting other data sets working and then revisit this to add the new data before the final model. I'm going to work on the commercial discard rates next as those are impacted by the change in boundary between north and south (as well as having additional years of data).

iantaylor-NOAA commented 3 years ago

Thanks @kellijohnson-NOAA for all the work to provide retained-only rec catch history in Pull Request #71. However, I think you had it right the first time as long as the discards were only dead discards, which I think they were.

Here's my memory of where we settled when we discussed this long ago.

1. catch as retained only, comps as separate retained and total discarded, estimate discards within the model
2. catch as retained + dead discards, comps as retained + dead discards
3. catch as retained + dead discards, comps as retained only
4. catch as retained only, comps as retained only

Option 1 would have required estimates of total discards, not dead discards, to which we would have applied a 7% discard mortality rate, and we couldn't "unscramble the egg" to get total rec discards.

Option 2 would have required somehow subsampling the discarded subset of the comps to give them only 7% weight in the total comp, which would have been messy or impossible.

Option 3 is the option that I thought we chose. The problem here is that it creates a mismatch between the length comps and the catch inputs. However, the illustration below (just cleaned up slightly in commit 009b08dcaaa434bc97bafd0647c64fc3555e7543 from the version presented at a meeting, but still needing a caption) shows that the difference in expected length composition between retained-only samples and retained + dead discards (if that were available) is pretty tiny and unlikely to impact the model.

Option 4 (available via Pull Request #71) doesn't have a good way to account for the additional mortality associated with the dead discards. It's not that much mortality because the survival rate is so high and there isn't that much discarding, but if we can account for it, we should, and Option 3 seems like an adequate way to do so. We could add this to the sensitivity wish list (#43), but I don't see a benefit in making the switch for the base model. Or we could run the sensitivity right now and use that to help make a decision.

rec_selectivity_illustration
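To make the catch arithmetic concrete, here is a minimal Python sketch of how a "retained + dead discards" catch (Options 2 and 3) differs from a retained-only catch (Options 1 and 4). The 7% discard mortality rate is the value mentioned above; the function names and the example tonnages are hypothetical.

```python
DISCARD_MORTALITY = 0.07  # discard mortality rate mentioned in the thread


def dead_discards(total_discards_mt, rate=DISCARD_MORTALITY):
    """Dead discards implied by total discards and a discard mortality rate."""
    return rate * total_discards_mt


def catch_with_dead_discards(retained_mt, total_discards_mt, rate=DISCARD_MORTALITY):
    """Catch input for Options 2 and 3: retained plus dead discards."""
    return retained_mt + dead_discards(total_discards_mt, rate)


# Hypothetical example: 100 mt retained, 50 mt discarded in total.
retained = 100.0
discarded = 50.0
total_mortality = catch_with_dead_discards(retained, discarded)
extra_share = dead_discards(discarded) / total_mortality
print(f"catch = {total_mortality:.1f} mt, dead-discard share = {extra_share:.1%}")
```

In this made-up case dead discards add 3.5 mt to a 100 mt retained catch, which shows why the added mortality is small when the survival rate is high.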

kellijohnson-NOAA commented 3 years ago

Right, now I remember. Sorry for my panic. I will close the pull request and note that the change in the code is how we can get to the fourth bullet relatively easily.

iantaylor-NOAA commented 3 years ago

Figure and text on this topic added in 5ba3a65a6a43c77ab4628ea2cf8b6bcfb77234d2. Maybe could use some edits, but seems adequate to close this issue.

pfmc-assessments / lingcod

DATA 2 Recreational Data #20