DOI-USGS / lake-temperature-model-prep


Suspect bathy for two WI lakes #357

Closed lindsayplatt closed 2 years ago

lindsayplatt commented 2 years ago

Jacob noticed that there are two lakes in our lake metadata CSV on ScienceBase with > 200 m depths, while their depths according to the WI DNR website are 18 ft and 29 ft. I spent a few minutes investigating, and it appears that the bathymetry data provided for these two sites via 3_params_fetch/in/WBIC_hypsos_lakeattributes.zip have very deep maximum depths of 228 and 383 (assumed meters), respectively. The only explanations I can think of are that they are either wrong OR recorded in inches (which would give 19 ft and 32 ft, respectively).
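As a quick sanity check on the inches idea (just arithmetic, not part of the pipeline):

# If 228 and 383 were actually inches, converting to feet gives values close
# to the DNR-reported depths of 18 ft and 29 ft
round(c(228, 383) / 12)
# [1] 19 32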

This is sort of related to #227, but most other depths in the dataset are reasonable (see histograms below).

library(dplyr)
library(ggplot2)

# Quick summary of lake depths in the pipeline. `gd_get()` pulls the data file
# from Google Drive (if needed) and returns its local path.
munged_lakedepths_fn <- scipiper::gd_get('7_config_merge/out/nml_lake_depth_values.rds.ind')
munged_lakedepths <- readRDS(munged_lakedepths_fn)

munged_lakedepths %>% 
  mutate(is_pretty_deep = ifelse(lake_depth >= 100, "Depths >= 100", "Depths < 100")) %>% 
  ggplot(aes(lake_depth)) + geom_histogram() + facet_wrap(vars(is_pretty_deep), nrow=2, scales = "free")

[Histograms of lake_depth, faceted into "Depths >= 100" and "Depths < 100"]

Here are the two lakes with depths > 200 m in the 7_config_merge/out/nml_lake_depth_values.rds file:

# Jacob pointed out that two lakes have depths > 200 m in our SB metadata file (both are WI lakes)
very_deep_lakes <- munged_lakedepths %>% filter(lake_depth > 200)
very_deep_lakes

# A tibble: 2 x 2
# Rowwise: 
  site_id                                      lake_depth
  <chr>                                             <dbl>
1 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}       228.
2 nhdhr_70333995                                     383.

Are those depths coming from the input WBIC depth file? No: the depths in 4_params_munge/out/wbic_depths.rds match the depths listed on the WI DNR website.

# Take a look at those two lake depths from their input depths file
wbic_lakedepths_fn <- scipiper::gd_get('4_params_munge/out/wbic_depths.rds.ind')
wbic_lakedepths <- readRDS(wbic_lakedepths_fn)
problem_wbic_lakedepths <- wbic_lakedepths %>% filter(site_id %in% very_deep_lakes$site_id)

# These are in meters but converting to feet matches the reported lake depths
# on the WI DNR website, so the depths in `nml_lake_depth_values` must come from bathy
#   Spider Lake (29 ft): https://dnr.wi.gov/lakes/lakepages/LakeDetail.aspx?wbic=1586600
#   Echo Lake (18 ft): https://dnr.wi.gov/lakes/lakepages/LakeDetail.aspx?wbic=1597800
problem_wbic_lakedepths %>% mutate(z_max_ft = round(z_max*3.28))

# A tibble: 2 x 4
  site_id                                      WBIC_ID      z_max z_max_ft
  <chr>                                        <chr>        <dbl>    <dbl>
1 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C} WBIC_1597800  5.49       18
2 nhdhr_70333995                               WBIC_1586600  8.84       29

The very deep depths for these two lakes appear to be coming from the bathymetry data:

# Now bring in the bathy that is used to create H_A (and then ultimately lake_depths).
# The bathy for those lakes goes much, much deeper (more than a simple unit issue
# could explain) and matches what is in `nml_lake_depth_values`.
wbic_bathy_fn <- scipiper::gd_get('4_params_munge/out/wbic_bathy.rds.ind')
wbic_bathy <- readRDS(wbic_bathy_fn)

wbic_bathy %>% filter(site_id == very_deep_lakes$site_id[1])
# A tibble: 18 x 3
   site_id                                      depths areas
   <chr>                                         <dbl> <dbl>
 1 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}    0    92.4
 2 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   13.4  92.3
 3 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   26.8  92.3
 4 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   40.2  92.2
 5 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   53.6  92.1
 6 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   67.1  92.1
 7 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   80.5  92.1
 8 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}   93.9  91.9
 9 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  107.   91.9
10 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  121.   91.8
11 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  134.   91.7
12 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  148.   91.6
13 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  161.   91.4
14 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  174.   91.4
15 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  188.   91.1
16 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  201.   90.9
17 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  215.   90.6
18 nhdhr_{D8A9DD23-94AD-4C15-B242-FE519B5EDF8C}  228.   89.9

wbic_bathy %>% filter(site_id == very_deep_lakes$site_id[2])
# A tibble: 18 x 3
   site_id        depths areas
   <chr>           <dbl> <dbl>
 1 nhdhr_70333995    0   201. 
 2 nhdhr_70333995   22.6 163. 
 3 nhdhr_70333995   45.1 157. 
 4 nhdhr_70333995   67.7 144. 
 5 nhdhr_70333995   90.2 128. 
 6 nhdhr_70333995  113.  116. 
 7 nhdhr_70333995  135.  104. 
 8 nhdhr_70333995  158.   95.5
 9 nhdhr_70333995  180.   89.8
10 nhdhr_70333995  203.   82.5
11 nhdhr_70333995  226.   73.1
12 nhdhr_70333995  248.   66.9
13 nhdhr_70333995  271.   55.8
14 nhdhr_70333995  293.   49.4
15 nhdhr_70333995  316.   40.6
16 nhdhr_70333995  338.   34.5
17 nhdhr_70333995  361.   24.4
18 nhdhr_70333995  383.   20.6

To verify that our munging code wasn't introducing the problem, I checked the values in the raw input .bth files; they have the same issue:

# Look at the raw bathy inputs
wbic_bathy_files <- sprintf('%s.bth', problem_wbic_lakedepths$WBIC_ID)
unzip('3_params_fetch/in/WBIC_hypsos_lakeattributes.zip', files = wbic_bathy_files)

readr::read_tsv(wbic_bathy_files[1], col_types = 'dd')
# A tibble: 18 x 2                                                                                                                            
   depth  area
   <dbl> <dbl>
 1   0    92.4
 2  13.4  92.3
 3  26.8  92.3
 4  40.2  92.2
 5  53.6  92.1
 6  67.1  92.1
 7  80.5  92.1
 8  93.9  91.9
 9 107.   91.9
10 121.   91.8
11 134.   91.7
12 148.   91.6
13 161.   91.4
14 174.   91.4
15 188.   91.1
16 201.   90.9
17 215.   90.6
18 228.   89.9

readr::read_tsv(wbic_bathy_files[2], col_types = 'dd')
# A tibble: 18 x 2                                                                                                                            
   depth  area
   <dbl> <dbl>
 1   0   201. 
 2  22.6 163. 
 3  45.1 157. 
 4  67.7 144. 
 5  90.2 128. 
 6 113.  116. 
 7 135.  104. 
 8 158.   95.5
 9 180.   89.8
10 203.   82.5
11 226.   73.1
12 248.   66.9
13 271.   55.8
14 293.   49.4
15 316.   40.6
16 338.   34.5
17 361.   24.4
18 383.   20.6
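For what it's worth, a similar mismatch could exist elsewhere. Here is a quick scan (a hedged sketch using the objects loaded above; the 2x threshold is arbitrary) for other WBIC lakes whose deepest bathy layer greatly exceeds the z_max reported in the depths file:

# Hypothetical scan for other mismatches: compare each lake's deepest bathy
# layer to the z_max reported in the WBIC depths file (2x threshold is arbitrary)
wbic_bathy %>%
  group_by(site_id) %>%
  summarize(bathy_max_depth = max(depths), .groups = "drop") %>%
  inner_join(wbic_lakedepths, by = "site_id") %>%
  filter(bathy_max_depth > 2 * z_max)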
jordansread commented 2 years ago

Let me know if you'd like help resolving this @lindsayplatt

I think these lakes were incorrectly digitized.

lindsayplatt commented 2 years ago

@jordansread what is the process we would take to correct them?

jordansread commented 2 years ago

The process was originally highly manual, and there is probably no modern way to speed it up.

We'd go to the DNR website for each WBIC (e.g., this one), and if it had a digitized map (like this), we'd look to see whether it had a depth vs. area or depth vs. volume plot, like this:

[Screenshot: DNR lake page plot of depth vs. area/volume]

We then used a program called DataThief to specify the x and y axes and their units, and had it trace the curve. We'd then convert the extracted values to the right units.

I'm not suggesting you do this. Instead, I'd recommend deleting those two files from the .zip, re-zipping the remaining files, and replacing the original file on Google Drive, followed by a gd_confirm_posted call on that file, which should update the .ind file that represents that Drive file (assuming you don't change the zip filename). A change to that .ind should trigger rebuilds of all of the impacted downstream files.
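Roughly, that workflow might look like the sketch below. The re-zip mechanics and the exact gd_confirm_posted call are assumptions on my part; double-check against the pipeline before running anything.

# Sketch of the suggested fix; paths are from this issue, mechanics are assumed
zipfile <- '3_params_fetch/in/WBIC_hypsos_lakeattributes.zip'
bad_files <- c('WBIC_1586600.bth', 'WBIC_1597800.bth')

# Extract everything, drop the two bad bathy files, and re-zip under the same name
tmp_dir <- 'tmp_wbic_rezip'
unzip(zipfile, exdir = tmp_dir)
file.remove(file.path(tmp_dir, bad_files))
file.remove(zipfile)
zip(zipfile, files = list.files(tmp_dir, full.names = TRUE), flags = '-j9X')

# After manually replacing the zip on Google Drive, refresh its indicator file
# so downstream targets rebuild (assuming the usual .ind naming convention)
scipiper::gd_confirm_posted(paste0(zipfile, '.ind'))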

I'd recommend making sure your build is clean before running this, and also checking the count of lakes with hypsography afterward; that count should drop by 2.
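For example (a minimal check, assuming the rebuilt wbic_bathy object is reloaded the same way as above):

# Count of WBIC lakes with hypsography; expect this to drop by exactly 2
# after the rebuild
length(unique(wbic_bathy$site_id))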