lindsayplatt / salt-modeling-data

Reproducible code for downloading, processing, and modeling data related to river salinization dynamics
0 stars 0 forks source link

Final episodic definitions set #55

Closed lindsayplatt closed 7 months ago

lindsayplatt commented 7 months ago

Went back and revisited #44 again because there were still some sites that were not being flagged as episodic and should have been or vice versa. I adjusted and added some criteria with this PR and the final episodic site count is at 34 (see SpC time series in the attached PDF). The following are the changes I made:

  1. Adjusted our definition of a "global peak". We are using the slopes of normalized SpC data to find local peaks and then filtered to only those that were above the median SpC for each site. This PR adjusts this requirement. Now only local peaks greater than the 75th percentile will qualify as global peaks and be used in our downstream peak calculations.
  2. I was doing something bizarre with how I calculated percent peaks in winter versus non-winter by calculating annual and then overall values. I simplified this to just tally up peaks that occurred in winter and the overall number of peaks per site. Then calculated the percent of the peaks that occurred in winter.
  3. Similar to the above, I was doing something complex with annual calculations and then an overall calculation for the percent differences criteria. So, I switched the percent difference between winter and non-winter to be a simple comparison of a site's mean winter and mean non-winter SpC values.
  4. Finally, I added one last criteria to qualify sites as episodic or not by looking at annual maximum SpC. For each year and site, I calculated maximum SpC for the winter and maximum for the non-winter. Then, I sum the number of years that the maximum in the winter was greater than the maximum in the non-winter. Then calculate a % based on the total number of years used.

all_SpC_episodic_final.pdf

Some notes about the PDF (which I did not bother to annotate or add to the legend):

Code to make the plot (must have built the pipeline) ```r library(targets) library(tidyverse) ts_sc <- tar_read(p3_ts_sc_qualified) ts_sc_peaks <- tar_read(p4_ts_sc_peaks) ts_sc_peak_info <- tar_read(p4_ts_sc_peak_summary) ts_sc_salt_sites <- tar_read(p4_episodic_sites) pdf('all_SpC_episodic_final.pdf', width = 25, height = 12.5) grp_size <- 25 n_grps <- ceiling(nrow(ts_sc_peak_info)/grp_size) for(grp in 1:n_grps) { i_start <- ((grp-1)*25)+1 i_end <- ifelse(grp == n_grps, nrow(ts_sc_peak_info), grp*25) sites_in_grp <- ts_sc_peak_info$site_no[i_start:i_end] ts_sc_sites_in_grp <- ts_sc %>% filter(site_no %in% sites_in_grp) %>% mutate(is_winter = month(dateTime) %in% c(12, 1, 2, 3)) %>% # mutate(is_winter = factor(is_winter, levels = c(TRUE, FALSE))) %>% mutate(year = year(dateTime)) %>% left_join(dplyr::select(ts_sc_peak_info, site_no, peaks_perc_winterYes, sc_perc_diff, perc_winter_max_higher), by = 'site_no') %>% mutate(facet_label = sprintf('%s (%s%% SpC winter peaks, %s%% diff, %s%% winter max)', site_no, round(peaks_perc_winterYes*100, 1), round(sc_perc_diff*100, 1), round(perc_winter_max_higher*100, 1))) # Add info for putting a label in the topleft of each facet ts_sc_sites_info <- ts_sc_sites_in_grp %>% group_by(site_no, facet_label) %>% summarize(maxSpC = max(SpecCond, na.rm=TRUE), minDate = min(dateTime, na.rm = TRUE), medianSpC = median(SpecCond, na.rm = TRUE), SpC75 = quantile(SpecCond, probs = 0.75, na.rm=TRUE), winterMeanSpC = mean(SpecCond[is_winter], na.rm = TRUE), notWinterMeanSpC = mean(SpecCond[!is_winter], na.rm = TRUE), .groups = 'keep') %>% mutate(site_label = ifelse(site_no %in% ts_sc_salt_sites, 'EPISODIC SITE', NA)) p <- ggplot(data = ts_sc_sites_in_grp) + geom_path(aes(x = dateTime, y = SpecCond, color = is_winter, group=year), na.rm = TRUE) + scale_color_manual(values = c(`TRUE` = 'lightblue1', `FALSE` = 'gold')) + theme_bw() + facet_wrap(~facet_label, scales = 'free') + # Add labels if it is an EPISODIC SITE geom_label(data = ts_sc_sites_info, aes(x = minDate, y = maxSpC, label = site_label), fill = 'salmon', hjust=0) + # geom_hline(data = ts_sc_sites_info, aes(yintercept = medianSpC)) + geom_hline(data = ts_sc_sites_info, aes(yintercept = SpC75), linetype = 'dashed') + geom_hline(data = ts_sc_sites_info, aes(yintercept = winterMeanSpC), color = 'skyblue3', linetype = 'dashed') + geom_hline(data = ts_sc_sites_info, aes(yintercept = notWinterMeanSpC), color = 'darkgoldenrod2', linetype = 'dashed') print(p) } dev.off() ```