DOI-USGS / lake-temperature-model-prep

QAQC of depths in this dataset #227

Open jordansread opened 2 years ago

jordansread commented 2 years ago

Depth is a critical lake attribute, and we've put a ton of work into aggregating sources of lake hypsography and/or depth. But we haven't dug into the oddities in some of these values in a long time, probably not since Luke's work merging parts of bathybase with LAGOS, which found that units weren't correctly entered in some of the sources (ft as meters, I think).

And this email from Keenan:

... first I wanted to ask if you knew anything about where the depths come from. Some of their numbers seem very odd. The MGLP data claims that Spider Lake, WI is 383 meters deep, but this page from their DNR says 29 feet. Spot checking in some shallower lakes vs. online sources show 10s of meters difference.

jordansread commented 2 years ago

I'm digging into this Spider Lake case because the lake is clearly not 383 m deep (but it appears that deep in the dataset).

Here is the contour map: https://dnr.wi.gov/lakes/maps/DNR/1586600a.pdf

WBIC crosswalk via

readRDS('2_crosswalk_munge/out/wbic_nhdhr_xwalk.rds') %>% filter(site_id == 'nhdhr_70333995')
# A tibble: 1 × 2
  WBIC_ID      site_id       
  <chr>        <chr>         
1 WBIC_1586600 nhdhr_70333995

Depth from the WBIC depths dataset is 29, stored as meters (shouldn't that be feet?)

readRDS('4_params_munge/out/wbic_depths.rds') %>% filter(site_id == 'nhdhr_70333995')
# A tibble: 1 × 3
  site_id        WBIC_ID      z_max
  <chr>          <chr>        <dbl>
1 nhdhr_70333995 WBIC_1586600    29

The faulty data came from wbic_bathy.rds

readRDS('../lake-temperature-model-prep/4_params_munge/out/wbic_bathy.rds') %>% filter(site_id == 'nhdhr_70333995')
# A tibble: 18 × 3
   site_id        depths areas
   <chr>           <dbl> <dbl>
 1 nhdhr_70333995    0   201. 
 2 nhdhr_70333995   22.6 163. 
 3 nhdhr_70333995   45.1 157. 
 4 nhdhr_70333995   67.7 144. 
 5 nhdhr_70333995   90.2 128. 
 6 nhdhr_70333995  113.  116. 
 7 nhdhr_70333995  135.  104. 
 8 nhdhr_70333995  158.   95.5
 9 nhdhr_70333995  180.   89.8
10 nhdhr_70333995  203.   82.5
11 nhdhr_70333995  226.   73.1
12 nhdhr_70333995  248.   66.9
13 nhdhr_70333995  271.   55.8
14 nhdhr_70333995  293.   49.4
15 nhdhr_70333995  316.   40.6
16 nhdhr_70333995  338.   34.5
17 nhdhr_70333995  361.   24.4
18 nhdhr_70333995  383.   20.6

Those values come from 3_params_fetch/in/WBIC_hypsos_lakeattributes.zip

^ These files were digitized back in 2014, and I have no idea why this particular file/lake would be so far off.
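
A quick way to surface cases like this one might be to compare the deepest hypsography contour per lake against the recorded z_max. The sketch below is only an illustration in base R: `compare_max_depths()` and the `rel_tol` threshold are hypothetical, and the toy rows reuse a few of the Spider Lake values printed above rather than reading the real .rds files.

```r
# Hypothetical helper (not part of the pipeline): compare the deepest
# hypsography depth per site with the recorded z_max for that site.
compare_max_depths <- function(z_max_tbl, bathy_tbl, rel_tol = 0.25) {
  # Deepest contour per site from the hypsography table
  bathy_max <- aggregate(depths ~ site_id, data = bathy_tbl, FUN = max)
  names(bathy_max)[names(bathy_max) == "depths"] <- "bathy_max"
  out <- merge(z_max_tbl, bathy_max, by = "site_id")
  # Relative difference; this is NaN when both depths are zero
  out$rel_diff <- abs(out$bathy_max - out$z_max) / pmax(out$bathy_max, out$z_max)
  out$mismatch <- out$rel_diff > rel_tol
  out
}

# Toy inputs using a subset of the Spider Lake values shown above
bathy <- data.frame(site_id = "nhdhr_70333995",
                    depths = c(0, 22.6, 383),
                    areas = c(201, 163, 20.6))
z_max <- data.frame(site_id = "nhdhr_70333995", z_max = 29)

result <- compare_max_depths(z_max, bathy)
print(result)
```

The 25% tolerance is arbitrary. Note that this check only says the two sources disagree; it cannot say which of them is right.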

jordansread commented 2 years ago

Spider Lake is correct in WI_Waterbodies.tsv, but its depth is not being converted from feet to meters in munge_wbic_depths!

read_tsv('3_params_fetch/in/WI_Waterbodies.tsv') %>% filter(WBIC == 1586600)
Rows: 17136 Columns: 14
── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (7): OFFICIAL_NAME, COUNTY_NAME, OFFICIAL_SIZE, OFFICIAL_SIZE_UNIT, OFFICIAL_MAX_DEPTH, OFFICIAL_MEAN_DEPTH, WATERBOD...
dbl (7): WBIC, OFFICIAL_SIZE_WITH_OUT_UNITS, LATITUDE, LONGITUDE, SAND_PCT, GRAVEL_PCT, MUCK_PCT

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 1 × 14
    WBIC OFFICIAL_NAME COUNTY_NAME OFFICIAL_SIZE OFFICIAL_SIZE_W… OFFICIAL_SIZE_U… OFFICIAL_MAX_DE… OFFICIAL_MEAN_D… LATITUDE
   <dbl> <chr>         <chr>       <chr>                    <dbl> <chr>            <chr>            <chr>               <dbl>
1 1.59e6 Spider Lake   Oneida      123.07 ACRES              123. ACRES            29 FEET          13 FEET              45.8
# … with 5 more variables: LONGITUDE <dbl>, SAND_PCT <dbl>, GRAVEL_PCT <dbl>, MUCK_PCT <dbl>, WATERBODY_SOURCE <chr>
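
Since OFFICIAL_MAX_DEPTH is a string such as "29 FEET", the conversion needs to split the number from the unit before scaling. Here is a minimal base R sketch of that step, not the actual munge_wbic_depths code: `depth_to_meters()` is a hypothetical helper, and the unit labels other than "FEET" are assumptions, since "FEET" is the only one visible in the output above.

```r
# Hypothetical helper: parse strings like "29 FEET" into meters.
# Unrecognized or missing units yield NA instead of a silent pass-through.
depth_to_meters <- function(x) {
  # Leading numeric part, e.g. "29" from "29 FEET"
  value <- suppressWarnings(as.numeric(sub("^\\s*([0-9.]+).*$", "\\1", x)))
  # Whatever follows the number is treated as the unit label
  unit <- toupper(trimws(sub("^\\s*[0-9.]+\\s*", "", x)))
  # "FT", "METERS" and "M" are assumed labels, not confirmed in the TSV
  factor <- ifelse(unit %in% c("FEET", "FT"), 0.3048,
                   ifelse(unit %in% c("METERS", "M"), 1, NA_real_))
  value * factor
}

depth_to_meters(c("29 FEET", "13 FEET"))
```

With 1 ft = 0.3048 m, the Spider Lake maximum depth of 29 feet comes out at about 8.84 m.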

jordansread commented 2 years ago

Note we've got a z_max of zero here too somehow.

jordansread commented 2 years ago

And several more z_max = 0 here
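
To list the zero-depth lakes in one pass, a filter along these lines could work. This is a sketch on toy data: `find_suspect_zmax()` and the two "hypothetical_" site ids are made up for illustration, and the real input would presumably be a table shaped like wbic_depths.rds with `site_id` and `z_max` columns.

```r
# Hypothetical helper: return rows whose z_max is missing, zero, or negative.
find_suspect_zmax <- function(depths_tbl) {
  suspect <- is.na(depths_tbl$z_max) | depths_tbl$z_max <= 0
  depths_tbl[suspect, , drop = FALSE]
}

# Toy data; only the first row reflects a value shown in this thread
toy <- data.frame(
  site_id = c("nhdhr_70333995", "hypothetical_a", "hypothetical_b"),
  z_max = c(29, 0, NA)
)

suspect_rows <- find_suspect_zmax(toy)
print(suspect_rows)
```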