DOI-USGS / lake-temperature-model-prep


Add norfork data #268

Closed padilla410 closed 2 years ago

padilla410 commented 2 years ago

This pull request completes the following:

Adding data to all_coop_dat_linked.feather

Run the code below to verify that the Norfork data sets were added to all_coop_dat_linked.feather:

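# verification that Norfork data is in munged data set
# assumes read_feather() comes from the feather package (arrow also provides one)
library(feather)
library(scipiper)  # sc_retrieve()
library(dplyr)     # %>%, filter(), group_by(), summarise()
library(stringr)   # str_detect()

dat_feather <- read_feather(sc_retrieve('7a_temp_coop_munge/out/all_coop_dat_linked.feather.ind'))

# count rows per source file whose name contains "Norfolk_"
filt_dat_norfork <- dat_feather %>%
  filter(str_detect(source, "Norfolk_")) %>%
  group_by(source) %>%
  summarise(count = n())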
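The check above returns a table of counts but does not fail if the table is empty. As a minimal sketch, the same idea can be written as an assertion. The toy data frame below is a made-up stand-in for dat_feather (its source names and temperatures are hypothetical), and base R is used so the snippet runs without the pipeline or any packages.

```r
# Toy stand-in for dat_feather; source names and values are made up.
toy_dat <- data.frame(
  source = c("Norfolk_a.csv", "Norfolk_a.csv", "Norfolk_b.csv", "Other_c.csv"),
  temp = c(10.2, 10.0, 9.7, 15.1),
  stringsAsFactors = FALSE
)

# Keep rows whose source contains the pattern, then count rows per source.
matched <- toy_dat[grepl("Norfolk_", toy_dat$source, fixed = TRUE), ]
counts <- table(matched$source)

# Stop with an error if no matching source is present.
stopifnot(length(counts) > 0, all(counts > 0))
print(counts)
```

In the real pipeline the same stopifnot() call could follow the summarise() step, checking nrow(filt_dat_norfork) instead of the toy counts.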

Adding the spatial crosswalk

The spatial crosswalk for this data set was created from the descriptive information in 6_temp_coop_fetch/explain/Norkfolk_62BRG-XXXX.docx and 6_temp_coop_fetch/explain/Norkfolk_DAM-XXXX.docx. Originally, I had manually assigned an nhdhr id to the data set because it only represents one location (a la PR #218), but that resulted in the omission of the data from all_coop_dat_linked.feather. The Norfork data was omitted because crosswalk_coop_dat() does not have a way to convert an id field (with an nhdhr id) to a site_id field without a crosswalk. While debugging this issue, I also found a few data sets that likely have the same problem; these are documented in issue #267.
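To illustrate how data can silently drop out at a crosswalk step, here is a minimal sketch. It assumes the linking behaves like an inner join on an id column; the actual logic of crosswalk_coop_dat() is not shown in this PR, so that is an assumption. All data frames, ids, and source names below are hypothetical.

```r
# Hypothetical observations: one source keyed by local site names,
# one keyed directly by an nhdhr id with no crosswalk entry.
obs <- data.frame(
  source = c("source_with_crosswalk", "source_with_crosswalk",
             "source_without_crosswalk"),
  id = c("site_A", "site_B", "nhdhr_0000"),
  temp = c(12.1, 11.8, 14.3),
  stringsAsFactors = FALSE
)

# Hypothetical crosswalk mapping local ids to site_id values.
xwalk <- data.frame(
  id = c("site_A", "site_B"),
  site_id = c("nhdhr_1111", "nhdhr_2222"),
  stringsAsFactors = FALSE
)

# merge() defaults to an inner join, so rows whose id is missing
# from the crosswalk are dropped without any warning.
linked <- merge(obs, xwalk, by = "id")

# Sources present before linking but absent afterwards.
dropped_sources <- setdiff(unique(obs$source), unique(linked$source))
print(dropped_sources)
```

Comparing the set of sources before and after linking, as in the last step, is one way to catch data sets that were omitted for this reason.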

When completing this step I used the directions in the repo README and in Hayley's Navico PR #210.

Adding the helper function

The helper function convert_ft_to_m() was added to lib/src/generic_utils.R to ensure consistent unit conversions between data sets.
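For readers unfamiliar with this kind of helper, a feet-to-meters conversion might look like the sketch below. The function name ft_to_m_sketch and its body are illustrative only; the actual convert_ft_to_m() in lib/src/generic_utils.R may differ, for example in input checks or rounding.

```r
# Hypothetical helper; name and body are illustrative only and are not
# the implementation in lib/src/generic_utils.R.
ft_to_m_sketch <- function(feet) {
  stopifnot(is.numeric(feet))
  feet * 0.3048  # one international foot is exactly 0.3048 m
}

# Example: convert a vector of made-up depths from feet to meters.
depths_ft <- c(0, 10, 32.8)
depths_m <- ft_to_m_sketch(depths_ft)
print(depths_m)
```

Keeping the conversion factor in one shared function means every data set munger applies the same value, which is the consistency benefit described above.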

Successful run of scmake("8_viz")

Snapshot of 8_viz/out/lakes_summary_fig.html: image

Note: there are no additional lakes listed in this figure (when compared to PR #263) because both the Norfork and Navico datasets contain data for Norfork Lake in Northwest Arkansas.

# count rows per source for the "Norfolk_" and "Waterbody_" data sets in the munged data
dat_feather <- read_feather(sc_retrieve('7a_temp_coop_munge/out/all_coop_dat_linked.feather.ind'))

count_by_dataset <- dat_feather %>%
  filter(str_detect(source, "Norfolk_") | str_detect(source, "Waterbody_")) %>%
  group_by(source) %>%
  summarise(count = n())