Closed andreranza closed 10 months ago
Something to build upon as an additional question in this Exercise section.
library(seasonal)
set.seed(2024)
n_months <- 12
tot_len <- n_months * 5
ev_const <- 1.40 # tweak me
# Creating a time series with monthly frequency
dates <- seq(as.Date("2018-01-01"), by = "month", length.out = tot_len)
data <- rnorm(tot_len, mean = 8, sd = 1)
seasonal_component <- sin(2 * pi * (1:tot_len) / n_months)
# Introducing an outlier in the first time series
data_with_outlier <- data
data_with_outlier <- data_with_outlier + seasonal_component
data_with_outlier[49] <- data_with_outlier[49] + 10
ts_with_outlier <- ts(data_with_outlier, start = c(2018, 1), frequency = n_months)
# Introducing extreme values in the second time series
data_with_extreme_values <- data
data_with_extreme_values <- data_with_extreme_values + seasonal_component
data_with_extreme_values[49] <- data_with_extreme_values[49] * ev_const
# Creating second ts object
ts_with_extreme_values <- ts(
data_with_extreme_values,
start = c(2018, 1),
frequency = n_months
)
# Plotting
par(mfrow = c(2, 1))
# Plot for time series with an outlier
plot(
ts_with_outlier,
main = "Time Series with an Outlier in 2022",
ylab = "Value",
xlab = "Date",
col = "blue"
)
# Plot for time series with extreme value
plot(
ts_with_extreme_values,
main = "Time Series with EV in 2022",
ylab = "Value",
xlab = "Date",
col = "green"
)
m_out <- seas(ts_with_outlier)
summary(m_out)
#>
#> Call:
#> seas(x = ts_with_outlier)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> Constant 2.05258 0.02983 68.814 < 2e-16 ***
#> AO2022.Jan 0.78994 0.15053 5.248 1.54e-07 ***
#> AR-Nonseasonal-01 0.43859 0.31377 1.398 0.162
#> MA-Nonseasonal-01 0.16216 0.34496 0.470 0.638
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> SEATS adj. ARIMA: (1 0 1) Obs.: 60 Transform: log
#> AICc: 206, BIC: 215.4 QS (no seasonality in final):0.2021
#> Box-Ljung (no autocorr.): 32.32 Shapiro (normality): 0.9438 **
#> Messages generated by X-13:
#> Warnings:
#> - At least one visually significant seasonal peak has been found
#> in one or more of the estimated spectra.
m_ext <- seas(ts_with_extreme_values)
summary(m_ext)
#>
#> Call:
#> seas(x = ts_with_extreme_values)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> MA-Nonseasonal-01 0.90142 0.05522 16.32 < 2e-16 ***
#> MA-Seasonal-12 0.99932 0.16195 6.17 6.81e-10 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 60 Transform: log
#> AICc: 175.8, BIC: 180.8 QS (no seasonality in final): 0
#> Box-Ljung (no autocorr.): 31.12 Shapiro (normality): 0.9896
#> Messages generated by X-13:
#> Notes:
#> - The program cannot perform hypothesis tests for kurtosis on
#> less than 50 observations.
Created on 2024-01-17 with reprex v2.0.2
I wonder if this would be more of a Case Study than an exercise.
Possible questions to be asked on the reprex below:
ev_const
so the program recognize obs 49
as an outlier?ev_const = 1.15
. observe what happens to the chosen transform. Can you explain why there is a difference?@christophsax and @jlivsey, how do you feel about those? I would not be able to answer number 5
There's a more in-depth discussion on the difference between EV and outliers in the X11 chp tough. Maybe this would fit better there.
For @jlivsey to decide.
WIP