Open kwythers opened 5 years ago
I might have figured this out... It turned out that I was running 'time_decompose()' on data with multiple observations on a single day. Once I filtered down to single locations and sites where only one observation per day was recorded, the error went away.
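For anyone who hits the same thing, here is a minimal sketch of how the check could look. The toy tibble is hypothetical and only borrows the column names from my data. Averaging the duplicates is just one assumed way to get to one row per day (I actually filtered instead), and I have not confirmed that this is the only cause of the error.

```r
library(dplyr)

# Hypothetical toy data standing in for isw_tss:
# staff "A" has two observations on 2019-01-02
toy <- tibble(
  sample_date = as.Date(c("2019-01-01", "2019-01-02", "2019-01-02",
                          "2019-01-01", "2019-01-02")),
  reported_value = c(10, 12, 18, 7, 9),
  staff_id_last_updt = c("A", "A", "A", "B", "B")
)

# List every group/date combination that has more than one observation
dupes <- toy %>%
  count(staff_id_last_updt, sample_date, name = "n_obs") %>%
  filter(n_obs > 1)

# Collapse to one row per group and day
# (taking the mean is an assumption; filtering is another option)
daily <- toy %>%
  group_by(staff_id_last_updt, sample_date) %>%
  summarise(reported_value = mean(reported_value), .groups = "drop")
```

If 'dupes' comes back empty for the real data, duplicated dates are probably not the problem and the cause is somewhere else.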
I have arranged my own data into a tibble that is as close as possible to the "tidyverse_cran_downloads" demonstration data:
class(tidyverse_cran_downloads)
[1] "grouped_tbl_time" "tbl_time" "grouped_df" "tbl_df" "tbl" "data.frame"
glimpse(tidyverse_cran_downloads)
Observations: 6,375
Variables: 3
Groups: package [15]
$ date 2017-01-01, 2017-01-02, 2017-01-03, 2017-01-04, 2017-01-05, 2017-01-06, 2017-01-07, 2017-01-08, 2017-01-09...
$ count 873, 1840, 2495, 2906, 2847, 2756, 1439, 1556, 3678, 7086, 7219, 0, 5960, 2904, 2854, 5428, 6358, 6973, 661...
$ package "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr", "tidyr",...
As you can see, 'class()' and 'glimpse()' show a very similar structure for both datasets, and I can replicate the results with the demonstration data just fine. However, when I try to apply the 'time_decompose()' function to my data (isw_tss), I get the error message "Only year, quarter, month, week, and day periods are allowed for an index of class Date".
I am confused by this as my date data are in the ymd format (same as the demonstration data). Any thoughts would be much appreciated.
I have attached a sample data file: isw_tss.txt
Here is the code I have modified, up to the point where the error message appears:
# load libraries
library(tidyverse)
library(tidyquant)
library(lubridate)
library(ggplot2)
library(ggpubr)
library(anomalize)
library(tibbletime)
# read in the data (forward slashes, since a single backslash starts an escape sequence in an R string)
isw_dmr <- read_csv('C:/Users/kwyther/export_isw_dmr.csv')
# change to lower case and remove rows with no reported value
isw_dmr <- rename_all(isw_dmr, tolower) %>% drop_na(reported_value)
# change sample_dates to date
isw_dmr$sample_date <- dmy(isw_dmr$sample_date)
# list of parameters
params <- isw_dmr %>% distinct(parameter_name)
# list of staff entering data
staff <- isw_dmr %>% distinct(staff_id_last_updt)
# simplify by parameter
# tss
isw_tss <- isw_dmr %>%
  select(sample_date, reported_value, parameter_name, staff_id_last_updt) %>%
  filter(parameter_name == 'Solids, Total Suspended (TSS)')
isw_tss <- isw_tss %>%
  group_by(staff_id_last_updt) %>%
  as_tbl_time(sample_date)
isw_tss <- isw_tss %>% arrange(sample_date, .by_group = TRUE)
isw_tss %>%
  ggplot(aes(sample_date, reported_value)) +
  geom_point(color = "#2c3e50", alpha = 0.25) +
  facet_wrap(staff_id_last_updt ~ .) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 30, hjust = 1)) +
  labs(title = "TSS reported values by staff", subtitle = "Data from ISW_DMRs")
isw_tss %>%
  # Data Manipulation / Anomaly Detection
  time_decompose(reported_value, method = "stl") %>%
  anomalize(remainder, method = "iqr") %>%
  time_recompose() %>%
  # Anomaly Visualization
  plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.25) +
  labs(title = "TSS Anomalies", subtitle = "STL + IQR Methods")