Updated juvenile data includes multiple rows per year for Duncan Channel and Grays MS

ebuhle commented 2 years ago

Hi @kalebentley, another question about a raw data update that breaks the data-processing code. I eventually traced this one back to the juvenile abundance data, specifically the fact that the new file contains multiple rows per year for Grays MS 2008-2021 and Duncan Channel 2003-2021. Below is the relevant chunk of juv_data, as constructed here (note that the file name in that version is out of date b/c I haven't pushed changes yet). No other pops / years have duplicate rows. It looks like the extra rows are due to distinguishing between wild and hatchery origin. I'm guessing these may have been added in preparation for modeling hatchery releases and returns, but I haven't cross-checked the new raw data file with the previous one to confirm that the old entries correspond to only the rows with origin == "Wild". Is that correct? If so, it should be easy enough to add that as a filter when constructing juv_data_incl.

         location year          origin     M_obs   tau_M_obs
1        Grays MS 1999  Hatchery_Grays  102901.0          NA
2        Grays MS 2000  Hatchery_Grays  134661.0          NA
3        Grays MS 2001  Hatchery_Grays  202833.0          NA
4        Grays MS 2002  Hatchery_Grays  305185.0          NA
5        Grays MS 2003  Hatchery_Grays  398000.0          NA
6        Grays MS 2004  Hatchery_Grays  357000.0          NA
7        Grays MS 2005  Hatchery_Grays  321000.0          NA
8        Grays MS 2006  Hatchery_Grays  155501.0          NA
9        Grays MS 2007  Hatchery_Grays  129457.0          NA
10       Grays MS 2008  Hatchery_Grays  147609.0          NA
11       Grays MS 2008            Wild 1866291.0 0.226600250
12       Grays MS 2009  Hatchery_Grays  102914.0          NA
13       Grays MS 2009            Wild 1288376.0 0.046554338
14       Grays MS 2010  Hatchery_Grays  122505.0          NA
15       Grays MS 2010            Wild 2882565.0 0.048369625
16       Grays MS 2011  Hatchery_Grays  250003.0          NA
17       Grays MS 2011            Wild 2203544.0 0.103147830
18       Grays MS 2012  Hatchery_Grays  199842.0          NA
19       Grays MS 2012            Wild 2451852.0 0.321384175
20       Grays MS 2013  Hatchery_Grays  159682.0          NA
21       Grays MS 2013            Wild 2664989.0 0.077219115
22       Grays MS 2014  Hatchery_Grays  148804.0          NA
23       Grays MS 2014            Wild 2711159.0 0.161962097
24       Grays MS 2015  Hatchery_Grays  192156.0          NA
25       Grays MS 2015            Wild 1106601.0 0.595510419
26       Grays MS 2016  Hatchery_Grays  141581.0          NA
27       Grays MS 2016            Wild 1306160.0 0.186600975
28       Grays MS 2017  Hatchery_Grays  131353.0          NA
29       Grays MS 2017            Wild 3043076.0 0.152436082
30       Grays MS 2018  Hatchery_Grays  131289.0          NA
31       Grays MS 2018            Wild 3519678.0 0.049708095
32       Grays MS 2019  Hatchery_Grays  130956.0          NA
33       Grays MS 2019            Wild 5278182.5 0.066195010
34       Grays MS 2020  Hatchery_Grays  195482.0          NA
35       Grays MS 2020            Wild  940323.5 0.107817345
36       Grays MS 2021  Hatchery_Grays  187063.0          NA
37       Grays MS 2021            Wild 2946175.0 0.105042619
38 Duncan Channel 2002 Hatchery_Duncan   45046.0          NA
39 Duncan Channel 2003            Wild   25478.0          NA
40 Duncan Channel 2003 Hatchery_Duncan  217436.0          NA
41 Duncan Channel 2004            Wild   45450.0          NA
42 Duncan Channel 2004 Hatchery_Duncan   75995.0          NA
43 Duncan Channel 2005            Wild   27814.0          NA
44 Duncan Channel 2005 Hatchery_Duncan       0.0          NA
45 Duncan Channel 2006            Wild   31213.0 0.006244135
46 Duncan Channel 2006 Hatchery_Duncan   19817.0          NA
47 Duncan Channel 2007            Wild   29707.0          NA
48 Duncan Channel 2007 Hatchery_Duncan   54390.0          NA
49 Duncan Channel 2008            Wild   25659.0          NA
50 Duncan Channel 2008 Hatchery_Duncan       0.0          NA
51 Duncan Channel 2009            Wild   27569.0          NA
52 Duncan Channel 2009 Hatchery_Duncan       0.0          NA
53 Duncan Channel 2010            Wild   32161.0          NA
54 Duncan Channel 2010 Hatchery_Duncan   25813.0          NA
55 Duncan Channel 2011            Wild   28827.0 0.057200898
56 Duncan Channel 2011 Hatchery_Duncan   59238.0          NA
57 Duncan Channel 2012            Wild   62229.0 0.079248291
58 Duncan Channel 2012 Hatchery_Duncan   55901.0          NA
59 Duncan Channel 2013            Wild   36160.0          NA
60 Duncan Channel 2013 Hatchery_Duncan   57538.0          NA
61 Duncan Channel 2014            Wild   32090.0 0.044970132
62 Duncan Channel 2014 Hatchery_Duncan   46002.0          NA
63 Duncan Channel 2015            Wild   48213.0          NA
64 Duncan Channel 2015 Hatchery_Duncan   88217.0          NA
65 Duncan Channel 2016            Wild   65550.0 0.003889908
66 Duncan Channel 2016 Hatchery_Duncan   87802.0          NA
67 Duncan Channel 2017            Wild        NA          NA
68 Duncan Channel 2017 Hatchery_Duncan  111209.0          NA
69 Duncan Channel 2018            Wild   41369.0          NA
70 Duncan Channel 2018 Hatchery_Duncan  105986.0          NA
71 Duncan Channel 2019            Wild   59560.0 0.097971672
72 Duncan Channel 2019 Hatchery_Duncan  105452.0          NA
73 Duncan Channel 2020            Wild   10010.0 1.233879787
74 Duncan Channel 2020 Hatchery_Duncan  110471.0          NA
75 Duncan Channel 2021            Wild   59401.0 0.042585052
76 Duncan Channel 2021 Hatchery_Duncan  103250.0          NA
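
For reference, a minimal base-R sketch of how duplicate population-year rows like these could be flagged. The data frame below is only a six-row excerpt copied from the table above, and `key`, `dup_keys`, and `dup_rows` are hypothetical names, not objects from the actual processing code.

```r
# Excerpt of the rows shown above; values are copied from the table.
juv_data <- data.frame(
  location = c("Grays MS", "Grays MS", "Grays MS",
               "Duncan Channel", "Duncan Channel", "Duncan Channel"),
  year     = c(2007L, 2008L, 2008L, 2002L, 2003L, 2003L),
  origin   = c("Hatchery_Grays", "Hatchery_Grays", "Wild",
               "Hatchery_Duncan", "Wild", "Hatchery_Duncan"),
  M_obs    = c(129457, 147609, 1866291, 45046, 25478, 217436)
)

# Build a population-year key and find keys that occur more than once.
key <- paste(juv_data$location, juv_data$year)
dup_keys <- unique(key[duplicated(key)])

# All rows belonging to a duplicated population-year.
dup_rows <- juv_data[key %in% dup_keys, ]
print(dup_rows)
```

In this excerpt only Grays MS 2008 and Duncan Channel 2003 are flagged; the hatchery-only years (Grays MS 2007, Duncan Channel 2002) have a single row each.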

kalebentley commented 2 years ago

@ebuhle, yes: I updated the juvenile data file today and changed the "structure" of the hatchery outplants. I was imagining we would go over these details tomorrow. In short, adults that are collected for the Grays River Hatchery are ultimately used to produce eggs/fry that are outplanted to multiple locations. The previous structure of the data did not capture this nuance. Nothing about the Duncan data changed except how the "Location" and "Origin" are specified.

ebuhle commented 2 years ago

Got it, thanks. It does seem like something must have changed about Duncan Channel because I haven't used origin at all in processing the juvenile data and I never got these duplicate pop / year cases before, but maybe I'm misunderstanding what you mean by the wording. In any case, I'll just add a filter for Wild-only for now.
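
A minimal sketch of what such a Wild-only filter might look like in base R. It assumes the `origin` column is labeled exactly as in the table above, uses a six-row excerpt of that table, and `juv_data_wild` is a hypothetical name; the real construction of `juv_data_incl` may differ.

```r
# Excerpt of the juvenile data; values are copied from the table above.
juv_data <- data.frame(
  location  = c("Grays MS", "Grays MS", "Grays MS",
                "Duncan Channel", "Duncan Channel", "Duncan Channel"),
  year      = c(2007L, 2008L, 2008L, 2002L, 2003L, 2003L),
  origin    = c("Hatchery_Grays", "Hatchery_Grays", "Wild",
                "Hatchery_Duncan", "Wild", "Hatchery_Duncan"),
  M_obs     = c(129457, 147609, 1866291, 45046, 25478, 217436),
  tau_M_obs = c(NA, NA, 0.226600250, NA, NA, NA)
)

# Keep only wild-origin rows (assumed label: "Wild").
juv_data_wild <- juv_data[juv_data$origin == "Wild", ]

# Sanity check: at most one row per population-year after filtering.
stopifnot(!anyDuplicated(juv_data_wild[, c("location", "year")]))
print(juv_data_wild)
```

Note that this filter also drops population-years that have only a hatchery row (Grays MS 1999-2007 and Duncan Channel 2002 in the table above), so it is worth confirming that those years were not in the previous version of the data.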

kalebentley commented 2 years ago

Right, the Duncan Channel data did change (slightly), too.

Here's an example of the structure of the juvenile data prior to today:

And here's an example of the updated structure I implemented today (subject to change with your input):

As you can see, the "wild" juvenile data hasn't changed at all. However, the hatchery data has changed in three ways:

1. Location.Reach -- I've updated Location.Reach to represent where the hatchery juveniles are actually outplanted (e.g., Grays River juveniles are outplanted into "Grays_MS" instead of it stating "Grays_Hatchery"; Duncan juveniles are outplanted [near] the "Duncan_Channel" rather than "Duncan_Hatchery"). This change is needed because of number 3 (below).
2. Origin -- Here, instead of just stating "Wild" or "Hatchery", I've specified the hatchery that produced the outplant.
3. Additional outplants that weren't previously in the data set -- As you can see in the Grays data from BY2019/outplant 2020, the adults that were collected from the Grays River (specifically, 222 of them) were spawned and used to generate four different outplants of fry. Before, the data set only included the Grays River plant (of ~195K) and not the outplants to Big Creek (Oregon), the Skamokawa spawning channel, and a remote site incubator (RSI) known as "Peterson RSI". The previous omission of these additional outplants shouldn't have affected any of our existing generated metrics, but it will matter when we start adding in these other populations (e.g., Skamokawa) and generating other metrics (e.g., in-hatchery survival).

ebuhle commented 2 years ago

Ahh, OK, that all makes sense. Thanks for laying it out, and sorry if I pre-empted our conversation tomorrow in my eagerness to re-fit the models today.

ebuhle / LCRchumIPM

Updated juvenile data includes multiple rows per year for Duncan Channel and Grays MS #15