DOI-USGS / lake-temperature-model-prep

Modify Univ of MO reservoir lat/longs and update pipeline #340

Closed padilla410 closed 2 years ago

padilla410 commented 2 years ago

Closes #289

I manually updated lat/longs for the following reservoirs from the University of MO data set:

The lat/longs provided with the temperature data did not intersect with NHDHR because they were located on land. I manually selected new lat/longs using Google Maps, then updated 1_crosswalk_fetch/in/UniversityofMissouri_2017_2020_Profiles.csv and the explainer files (UniversityofMissouri_****_Profiles_explainer.docx) on Google Drive.
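For anyone wanting to screen for on-land coordinates before the crosswalk step, here is a minimal sketch using `sf`. It is not the pipeline's code: the square polygon stands in for an NHDHR waterbody, and the site names and coordinates are made up for illustration.

```r
library(sf)

# Hypothetical lake polygon (a simple square), standing in for an NHDHR waterbody
lake <- st_sfc(st_polygon(list(rbind(
  c(-90.5, 36.9), c(-90.3, 36.9), c(-90.3, 37.1), c(-90.5, 37.1), c(-90.5, 36.9)
))), crs = 4326)

# Hypothetical monitoring sites: one just west of the polygon, one inside it
sites <- data.frame(
  Missouri_ID = c("site_on_land", "site_in_lake"),
  lon = c(-90.6, -90.4),
  lat = c(37.0, 37.0)
)
sites_sf <- st_as_sf(sites, coords = c("lon", "lat"), crs = 4326)

# TRUE where a point falls inside (or on the boundary of) the polygon
in_lake <- lengths(st_intersects(sites_sf, lake)) > 0

# Sites that would need a manually corrected lat/long
needs_fix <- sites$Missouri_ID[!in_lake]
```

Any ID that ends up in `needs_fix` would be a candidate for the kind of manual correction described above.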

To verify the work, I completed 3 checks:

  1. Verification that the lat/longs are plotting appropriately
  2. Verification that the NHDHR crosswalk includes eight lakes
  3. Verification of record update in all_coop_dat_linked.feather

and then built the pipeline through 8_viz

Verification that the lat/longs are plotting appropriately

Lake Wapapello image

Table Rock Lake (the zoomed in map labels are wrong on this one) image

image

Verification that the NHDHR crosswalk includes eight lakes

> readRDS(sc_retrieve(
+   '2_crosswalk_munge/out/univ_mo_nhdhr_xwalk.rds.ind'
+ ))
# A tibble: 8 x 4
  site_id         Missouri_ID  `Lake Name`              County 
  <chr>           <chr>        <chr>                    <chr>  
1 nhdhr_120032533 Missouri_100 Bull Shoals Lake         Ozark  
2 nhdhr_120032269 Missouri_149 Lake Ozark - Bagnell Dam Miller 
3 nhdhr_106716325 Missouri_93  Lake Stockton            Cedar  
4 nhdhr_120032180 Missouri_145 Mark Twain Lake          Ralls  
5 nhdhr_102216470 Missouri_92  Pomme De Terre Lake      Hickory
6 nhdhr_120032884 Missouri_98  Table Rock Lake          Stone  
7 nhdhr_120031146 Missouri_89  Truman Res.              Benton 
8 nhdhr_120032495 Missouri_30  Wapapello Lake           Wayne 
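The same check could be written as assertions so that it fails loudly if the crosswalk changes. This is only a sketch: the data frame below re-types the `site_id` and `Missouri_ID` columns from the output above instead of reading the `.rds` file, and the expected count of eight is the number stated in this PR.

```r
# Crosswalk values copied from the tibble printed above
xwalk <- data.frame(
  site_id = c("nhdhr_120032533", "nhdhr_120032269", "nhdhr_106716325",
              "nhdhr_120032180", "nhdhr_102216470", "nhdhr_120032884",
              "nhdhr_120031146", "nhdhr_120032495"),
  Missouri_ID = c("Missouri_100", "Missouri_149", "Missouri_93",
                  "Missouri_145", "Missouri_92", "Missouri_98",
                  "Missouri_89", "Missouri_30"),
  stringsAsFactors = FALSE
)

expected_n_lakes <- 8  # number of lakes this PR expects in the crosswalk

stopifnot(
  nrow(xwalk) == expected_n_lakes,
  anyDuplicated(xwalk$site_id) == 0,      # each NHDHR lake matched once
  anyDuplicated(xwalk$Missouri_ID) == 0,  # each Missouri site matched once
  all(grepl("^nhdhr_", xwalk$site_id))    # every site has an NHDHR-style ID
)
```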

Verification of record update in all_coop_dat_linked.feather

Checking on the number of records in the final dataset:

> old <- arrow::read_feather('7a_temp_coop_munge/out/all_coop_dat_linked - Copy.feather') # updated 2022-05-09
> new <- arrow::read_feather('7a_temp_coop_munge/out/all_coop_dat_linked.feather') # updated 2022-05-19
> 
> nrow(old)
[1] 7127642
> nrow(new)
[1] 7167418
> new_recs <- nrow(new) - nrow(old)
> new_recs
[1] 39776
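One way to see where added records come from is to compare per-source counts between the two files. The sketch below uses tiny made-up tibbles in place of the feather files; only the `source` column name is taken from the code in this PR, and the file names are hypothetical.

```r
library(dplyr)

# Toy stand-ins for the old and new feather files (hypothetical values)
old <- tibble(source = c("a.rds", "a.rds", "umo.rds"))
new <- tibble(source = c("a.rds", "a.rds", "umo.rds", "umo.rds", "umo.rds"))

# Count records per source in each file, then keep sources whose count changed
added_by_source <- full_join(
  count(old, source, name = "n_old"),
  count(new, source, name = "n_new"),
  by = "source"
) %>%
  mutate(
    n_old = coalesce(n_old, 0L),  # source absent from the old file
    n_new = coalesce(n_new, 0L),  # source absent from the new file
    n_added = n_new - n_old
  ) %>%
  filter(n_added != 0)
```

If the per-source differences sum to `nrow(new) - nrow(old)`, the change in record count is fully accounted for by the listed sources.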

We can see that the four additional lakes are in all_coop_dat_linked.feather when we compare the old data to the new:

> new_umo <- new %>% 
+   filter(source == '7a_temp_coop_munge/tmp/UniversityofMissouri_LimnoProfiles_2017_2020.rds') %>% 
+   pull("Lake Name") %>% 
+   unique
> 
> old_umo <- old %>% 
+   filter(source == '7a_temp_coop_munge/tmp/UniversityofMissouri_LimnoProfiles_2017_2020.rds') %>% 
+   pull("Lake Name") %>% 
+   unique
> old_umo
[1] "Truman Res."     "Lake Stockton"   "Mark Twain Lake"
> new_umo
[1] "Wapapello Lake"           "Truman Res."              "Pomme De Terre Lake"     
[4] "Lake Stockton"            "Table Rock Lake"          "Mark Twain Lake"         
[7] "Lake Ozark - Bagnell Dam"
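The added lakes can also be listed directly with `setdiff()` instead of comparing the two vectors by eye. The vectors below are re-typed from the output above so the snippet runs on its own.

```r
# Lake names copied from the console output above
old_umo <- c("Truman Res.", "Lake Stockton", "Mark Twain Lake")
new_umo <- c("Wapapello Lake", "Truman Res.", "Pomme De Terre Lake",
             "Lake Stockton", "Table Rock Lake", "Mark Twain Lake",
             "Lake Ozark - Bagnell Dam")

# Lakes present in the new data but not in the old data
added_lakes <- setdiff(new_umo, old_umo)

# Lakes that dropped out (should be empty)
dropped_lakes <- setdiff(old_umo, new_umo)
```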

Digging into why we don't have 8 lakes in all_coop_dat_linked.feather:

> # checking on raw parsed data - 
> # looks like Bull Shoals is not present
> umo_munged <- 
+   readRDS('7a_temp_coop_munge/tmp/UniversityofMissouri_LimnoProfiles_2017_2020.rds')
> 
> umo_munged %>% 
+   pull("Missouri_ID") %>% 
+   unique
[1] "Missouri_30"  "Missouri_89"  "Missouri_92"  "Missouri_93"  "Missouri_98"  "Missouri_145"
[7] "Missouri_149"

I did a quick check on the files in UniversityofMissouri_LimnoProfiles_2017_2020.zip and verified that "Missouri_100" (aka Bull Shoals) is not present in the unparsed data set.
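The gap between the crosswalk and the parsed data can be confirmed by comparing the two sets of IDs. This sketch re-types the IDs from the crosswalk and from the munged data shown earlier in this PR, so it does not depend on the files being present.

```r
# IDs from the NHDHR crosswalk (8 lakes)
xwalk_ids <- c("Missouri_100", "Missouri_149", "Missouri_93", "Missouri_145",
               "Missouri_92", "Missouri_98", "Missouri_89", "Missouri_30")

# IDs found in the munged temperature data (7 lakes)
munged_ids <- c("Missouri_30", "Missouri_89", "Missouri_92", "Missouri_93",
                "Missouri_98", "Missouri_145", "Missouri_149")

# In the crosswalk but with no temperature records
missing_from_data <- setdiff(xwalk_ids, munged_ids)

# Temperature records with no crosswalk match (should be empty)
missing_from_xwalk <- setdiff(munged_ids, xwalk_ids)
```

Here `missing_from_data` is "Missouri_100", matching the Bull Shoals finding above.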

Results from 8_viz

This is the updated summary table from lakes_summary_fig.html: image

This summary table differs quite a bit from my most recent run in PR #338 (9,369 vs 8,672 GLM lakes). I have not been able to identify the PR where the number of GLM lakes increased.

But, to verify what I've done here, I have a few screenshots.

Table Rock Lake (2022-05-09): image

Table Rock Lake (2022-05-19): image

Wapapello Lake (2022-05-09): image

Wapapello Lake (2022-05-19) 😎 : image

lindsayplatt commented 2 years ago

Also, cannot express enough how easy to follow and jump in you make these PRs! 🥰 love it

padilla410 commented 2 years ago

Fabulous. I am going to accept your recommendation and add a bit more context to the "overview" at the top of the PR. Then I will merge.