DOI-USGS / lake-temperature-model-prep

Switch depths from LAGOS NE to LAGOS US #329

Closed lindsayplatt closed 2 years ago

lindsayplatt commented 2 years ago

Fixes depth part of #220. To switch them out I did the following:

  1. Uploaded the LAGOS_US_depths.csv file to 3_params_fetch/in, created the appropriate explainer file, and added it to the source metadata table.
  2. Deleted the LAGOS_NE_depths.csv and related inds. There was no build file to delete since it wasn't a target in a yml anymore (see my added notes to 3_params_fetch/src/fetch_LAGOS.R in this PR for an explanation).
  3. Attempted to run scmake('8_viz/out/lakes_summary_fig.html.ind') to build the pipeline with this new source of data but hit a few snags. See details below.

Build issues! Coop temperature files kept rebuilding because of a tricky Google Drive file permissions issue that popped up. The target that builds every time, 6_temp_coop_fetch/out/coop_all_files.rds.tind, had 4 files that did not get returned because I apparently lacked appropriate permissions to them. Jordan ended up re-sharing the whole 6_temp_coop_fetch/in/ folder with me just to be sure. I was able to mostly proceed, but I'm now stuck on building 7a_temp_coop_munge/out/all_coop_dat_linked.feather.ind, which errors on SD_Lake_temp_export.rds:

Now binding 7a_temp_coop_munge/tmp/SD_Lake_temp_export.rds
Restoring previous version of 7a_temp_coop_munge/out/all_coop_dat_linked.feather.ind
Error in FUN(X[[i]], ...) : invalid first argument

Once those issues are solved, I will finish building scmake('8_viz/out/lakes_summary_fig.html.ind') to show the resulting impacts of this new max depth data. Here is the before of the HTML:

image

EDIT ON 4/19/2022: Here is the "after"!

image

lindsayplatt commented 2 years ago

Just went to try rebuilding all_coop_dat_linked.feather again to see if the failure still happened and it did that thing where it dropped some of the coop files in the Google Drive list! Almost like I lost permissions again. Here are the ones that keep dropping:

image

lindsayplatt commented 2 years ago

I just added the xwalk code. I still can't get it to build 8_viz right now (the Google file permissions issue popped up again), but I did look at the intersection of depth + kw files and that count is now 9,016 (it was 8,333 in the build of the lake map before the addition of LAGOS-US depth). So that's 683 new GLM model-able lakes, almost 700!

lake_names_ind <- '2_crosswalk_munge/out/gnisname_nhdhr_xwalk.rds.ind'
lake_data_ind <- '7_config_merge/out/nml_H_A_values.rds.ind'
kw_ind <- "7_config_merge/out/nml_Kw_values.rds.ind"

lake_names <- readRDS(scipiper::sc_retrieve(lake_names_ind))
lake_data <- readRDS(scipiper::sc_retrieve(lake_data_ind))
kw_val_ids <- readRDS(scipiper::sc_retrieve(kw_ind))[["site_id"]]

all_lakes <- unique(lake_names$site_id)
has_zmax <- all_lakes %in% names(lake_data) # any lake in this dataset has zmax
has_kw <- all_lakes %in% kw_val_ids

sum(has_zmax & has_kw)

[1] 9016
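
For anyone who wants to sanity-check the counting logic without retrieving the pipeline files, here is a minimal sketch on made-up data. The site IDs and `lake_data_toy` are hypothetical stand-ins; it assumes, as in the script above, that the H/A values object is a list named by site ID and that the Kw IDs are a character vector.

```r
# Toy stand-ins for the real pipeline objects (all IDs are hypothetical)
all_lakes <- c("nhdhr_1", "nhdhr_2", "nhdhr_3", "nhdhr_4")
lake_data_toy <- list(nhdhr_1 = list(), nhdhr_2 = list(), nhdhr_4 = list()) # lakes with zmax
kw_val_ids <- c("nhdhr_2", "nhdhr_3", "nhdhr_4")                           # lakes with Kw

# Same membership tests as in the script above
has_zmax <- all_lakes %in% names(lake_data_toy)
has_kw <- all_lakes %in% kw_val_ids

n_both <- sum(has_zmax & has_kw)         # number of lakes with both depth and Kw
both_ids <- all_lakes[has_zmax & has_kw] # which lakes those are
```

Because `%in%` returns a logical vector aligned with `all_lakes`, the same vectors can be used both to count and to subset.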

lindsayplatt commented 2 years ago

Once this is approved, we need to actually delete the 1_crosswalk_fetch/in/LAGOS_NE_All_Lakes_4ha.zip input file from Google Drive

jordansread commented 2 years ago

I think you will need to snag my PR after merging and that should help you advance.

lindsayplatt commented 2 years ago

Turns out there's a bit more to deciding if a lake is GLM model-able than what was compared in this comment https://github.com/USGS-R/lake-temperature-model-prep/pull/329#issuecomment-1101905970 (before I was able to successfully build the full pipeline).

image

Need to also include the "meteo" constraint in the quick script I did before (add the following, then re-run):

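library(dplyr) # for %>%, filter(), and pull()

meteo_ind <- "7_config_merge/out/nml_meteo_fl_values.rds.ind"
meteo_files_ind <- "7_drivers_munge/out/7_all_local_drivers.rds.ind"
# Driver files that actually exist locally
local_meteo_files <- readRDS(scipiper::sc_retrieve(meteo_files_ind))[["local_driver"]]
# Keep only the sites whose assigned meteo file is among the local driver files
meteo_file_ids <- readRDS(scipiper::sc_retrieve(meteo_ind)) %>%
  filter(meteo_fl %in% local_meteo_files) %>% pull(site_id)
has_meteo <- all_lakes %in% meteo_file_ids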

sum(has_zmax & has_kw)
[1] 9369

sum(has_zmax & has_kw & has_meteo)
[1] 8672
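
The gap between the two counts (9369 - 8672 = 697 lakes) is the set with depth and Kw but no local meteo file. Below is a minimal sketch of how those lakes could be listed, using made-up logical vectors in place of the real `has_*` vectors; the IDs and values are hypothetical.

```r
# Toy stand-ins for the real vectors (IDs and values are hypothetical)
all_lakes <- c("nhdhr_1", "nhdhr_2", "nhdhr_3", "nhdhr_4")
has_zmax  <- c(TRUE, TRUE, FALSE, TRUE)
has_kw    <- c(TRUE, TRUE, TRUE, TRUE)
has_meteo <- c(TRUE, FALSE, TRUE, FALSE)

# Lakes that pass the depth and Kw constraints but fail only on meteo
missing_meteo_only <- all_lakes[has_zmax & has_kw & !has_meteo]
n_missing_meteo <- length(missing_meteo_only)
```

A list like `missing_meteo_only` would be the set to revisit once more meteo coverage is available.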

jordansread commented 2 years ago

Woooo hoooo! Cool, glad this worked w/o running into the Drive or NLDAS hiccups again. For has_meteo, since the upcoming NLDAS pipeline will be CONUS scale, it should be simple to add more meteo coverage for the missing cells.

lindsayplatt commented 2 years ago

Once this is approved, we need to actually delete the 1_crosswalk_fetch/in/LAGOS_NE_All_Lakes_4ha.zip input file from Google Drive

I just did this ✔

jordansread commented 2 years ago

And others can delete that locally too to free up space