DOI-USGS / lake-temperature-model-prep


Add Three Bull Shoals Datasets from Issue 320 #337

Closed padilla410 closed 2 years ago

padilla410 commented 2 years ago

This PR is a partial solution to #320 (this table) and adds data from the following cooperator files:

The data in these files comes from two locations: Bull Shoals Reservoir and Lake of the Ozarks. The Bull Shoals data was successfully added to the pipeline, while the Lake of the Ozarks data was not, due to the unresolved issue #289.

A case_when question

I have a question for @jread-usgs. While working on these parsers, I looked for places to use regular expressions and case_when. I tried six ways from Sunday to integrate this snippet into this portion of a dplyr pipeline.

Specifically, I thought this should work:

# read in data and clean
clean <- list_sheets %>%
  purrr::map_df(~ readxl::read_xlsx(path = file_path, sheet = .x)) %>%
  purrr::discard(~all(is.na(.x))) %>%  # remove columns with all NA
  dplyr::filter(!is.na(Timestamp)) %>% # remove rows without timestamp
  dplyr::rename_with(~ new_col_names) %>%
  mutate(

    DateTime = case_when(
      class(DateTime) == 'character' ~ mdy_hm(DateTime) %>% as.Date,
      all(class(DateTime)) != 'character' ~ as.Date(DateTime),
    ),

    Timezone = c('CDT/CST'),
    temp = fahrenheit_to_celsius(temp),
    depth = -1 * depth,
    Missouri_ID = 'Missouri_100' # value from Univ MO xwalk (https://drive.google.com/file/d/11w6-LXCDSDCjipFYPxUgJyf7YB9BtXYR/view?usp=sharing)
  ) %>%
  dplyr::select(DateTime, Timezone, depth, temp, site, Missouri_ID)

I need conditional handling for DateTime because the field is read as either character or POSIXct, depending on the file, across the 14 files in 6_temp_coop_fetch/in/Temp_DO_BSL_MM_DD_YYYY.zip.

Results from scmake("8_viz")

Bull Shoals BEFORE image

Bull Shoals AFTER image

Summary numbers (no change compared to #333): image

lindsayplatt commented 2 years ago

@padilla410 I am reviewing this now. From your question about case_when(), it sounds like the code you posted does not work even though you expected it to. Do you have any more detail about what isn't working? That would help me troubleshoot toward a working solution.
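
In the meantime, here is a minimal sketch of one likely cause and a possible workaround, not a confirmed diagnosis. case_when() is vectorized over rows, while class(DateTime) describes the whole column: for a POSIXct column it returns two strings, so the comparison gives a length-2 logical rather than one value per row. In addition, all() expects logical input, so all(class(DateTime)) errors when handed a character vector. A plain if/else on the column type avoids both problems. The helper name to_date and the tz argument are hypothetical, and base R as.POSIXct() stands in for lubridate::mdy_hm() only so the sketch runs without extra packages.

```r
# Hypothetical helper (not part of this PR): coerce a DateTime column that may
# arrive as character ("mm/dd/YYYY HH:MM") or as POSIXct into a Date.
to_date <- function(x, tz = "UTC") {
  if (is.character(x)) {
    # base R stand-in for lubridate::mdy_hm(); assumes month/day/year hour:minute
    as.Date(as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = tz), tz = tz)
  } else {
    # POSIXct (or anything as.Date() already understands)
    as.Date(x, tz = tz)
  }
}

# class() on a POSIXct vector has two elements, so this comparison yields a
# length-2 logical instead of a single TRUE/FALSE
posix_example <- as.POSIXct("2019-07-04 13:30", tz = "UTC")
class_check <- class(posix_example) == "character"

# Both input types end up as the same Date
char_example <- "07/04/2019 13:30"
date_from_char <- to_date(char_example)
date_from_posix <- to_date(posix_example)
```

Inside the pipeline this would replace the case_when() call with DateTime = to_date(DateTime), assuming every sheet within a single file is read with the same type so the combined column is either all character or all POSIXct.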