ropensci / tsbox

tsbox: Class-Agnostic Time Series in R
https://docs.ropensci.org/tsbox

Known frequencies: 1 week #226

Open mivazq opened 2 months ago

mivazq commented 2 months ago

Hi

Why is "1 week"/"7 days" not included as a standard frequency in meta_freq_data, the table that detected frequencies are checked against?

If possible, could it be added?

Best Miguel

christophsax commented 2 months ago

I don't know, to be honest. Do you have an example where this is useful? We could use this for a test, then.

mivazq commented 2 months ago

An MWE is provided by the weekly economic activity indicator published by SECO.

Here is some code:

library(tidyverse)   # attaches dplyr and tibble, so library(dplyr) is not needed
library(data.table)  # for fread()
library(tsbox)

# Example of weekly data
url <- "https://www.seco.admin.ch/dam/seco/de/dokumente/Wirtschaft/Wirtschaftslage/indikatoren/wwa.csv.download.csv/wwa.csv"
raw_data <- fread(url)
data_week <- raw_data %>%
  as_tibble() %>% 
  filter(structure == "seco_wwa") %>% 
  mutate(time = as.Date(date, format = "%m.%Y.%d")) %>% 
  select(id = structure, time, value)

# Example of quarterly data
url <- "https://www.seco.admin.ch/dam/seco/en/dokumente/Wirtschaft/Wirtschaftslage/BIP_Daten/ch_seco_gdp_csv.csv.download.csv/ch_seco_gdp.csv"
raw_data <- fread(url)
data_quarter <- raw_data %>%
  as_tibble() %>% 
  filter(structure == "gdp" & seas_adj == "csa" & type == "nom") %>% 
  mutate(time = as.POSIXct(date, format = "%m.%Y.%d")) %>% 
  select(id = structure, time, value)

# Convert to ts object and extract time dimension
weeks <- ts_ts(data_week) %>% tsbox:::ts_to_date_time()
quarters <- ts_ts(data_quarter) %>% tsbox:::ts_to_date_time()

# When printing, quarters are correctly displayed as dates, whereas weeks are
# timestamps. I assume this is because "week" is not a pre-defined frequency,
# so the timeline is simply divided by the number of observations.
head(weeks)
head(quarters)

# The start of the weekly data is set at 2005.00548 instead of simply 2005,
# because the calculated timestamp is taken as the reference instead of a
# pre-constructed, standardized measure of time.
tsbox:::date_time_to_tsp(weeks)
tsbox:::date_time_to_tsp(quarters)

# Frequency correctly identified for quarters, not for weeks
tsbox:::frequency_table(data_week$time)
tsbox:::frequency_table(data_quarter$time)
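
For readers who cannot download the SECO files, the same effect can be sketched with base R alone. The dates below are made up (consecutive Mondays starting on 2005-01-03, which is an assumption about the first weekly observation), and the decimal-year formula is a naive illustration, not tsbox's internal conversion.

```r
# Hypothetical weekly dates: six consecutive Mondays (not the SECO data)
weekly_dates <- seq(as.Date("2005-01-03"), by = "week", length.out = 6)

# Spacing between observations in days: constant for weekly data
step_days <- as.numeric(diff(weekly_dates))
unique(step_days)

# Naive decimal-year position of the first date (2005 has 365 days):
# two days after 1 January, i.e. 2005 + 2 / 365
year_start <- as.Date("2005-01-01")
start_decimal <- 2005 + as.numeric(weekly_dates[1] - year_start) / 365
round(start_decimal, 5)
```

Under that assumption the start lands on a fractional year rather than on 2005 exactly, which is consistent with the value reported by the tsp conversion above.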

mivazq commented 2 months ago

As a follow-up: it is important that the implemented solution uses a tolerance, so that a weekly frequency is correctly detected in the following cases:

Thanks a lot for your work :)
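
The idea of a tolerance can be sketched roughly as follows. This is only an illustration under assumptions: `is_weekly()` and its `tol_days` argument are hypothetical names, not part of tsbox, and the one-day tolerance is an arbitrary choice rather than a value taken from this thread.

```r
# Hypothetical helper (not part of tsbox): TRUE if every gap between
# consecutive dates is within `tol_days` of 7 days
is_weekly <- function(dates, tol_days = 1) {
  gaps <- as.numeric(diff(sort(as.Date(dates))))
  length(gaps) > 0 && all(abs(gaps - 7) <= tol_days)
}

# Made-up test series
regular <- seq(as.Date("2005-01-03"), by = "week", length.out = 5)
shifted <- regular
shifted[3] <- shifted[3] + 1  # one observation arrives a day late
monthly <- seq(as.Date("2005-01-01"), by = "month", length.out = 5)

is_weekly(regular)  # exact 7-day gaps
is_weekly(shifted)  # gaps of 7, 8, 6 and 7 days, all within the tolerance
is_weekly(monthly)  # gaps of roughly 30 days, outside the tolerance
```

A check like this only looks at the spacing between observations; whether it fits into the existing frequency table lookup is a separate question.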

mivazq commented 1 month ago

Dear Developers

I am following up again on this issue.

I have come to the realisation that if you correct the bugs mentioned in my other issue, #228, the lack of a recognised weekly frequency becomes much less important. After the correction, one can simply take the date of the timestamp with as.Date() and work with that. For now this is highly problematic, as one gets wrong dates due to the bugs in #228.

Regardless, if you are still keen to add weekly data as a standard, recognised frequency, the solution would need the following adjustments:

Further tweaks might be needed for the special cases described in my previous comment.

Have a good evening!