google / bayesnf

Bayesian Neural Field models for prediction in large-scale spatiotemporal datasets
https://google.github.io/bayesnf/
Apache License 2.0

Follow up questions on making use of seasonality #50

Open f0lie opened 1 month ago

f0lie commented 1 month ago

https://google.github.io/bayesnf/tutorials/BayesNF_Tutorial_on_Hungarian_Chickenpox/

from bayesnf.spatiotemporal import BayesianNeuralFieldMAP

model = BayesianNeuralFieldMAP(
  width=256,
  depth=2,
  freq='W',
  seasonality_periods=['M', 'Y'], # equivalent to [365.25/12, 365.25]
  num_seasonal_harmonics=[2, 10], # two harmonics for M; ten harmonics for Y
  feature_cols=['datetime', 'latitude', 'longitude'], # time, spatial 1, ..., spatial n
  target_col='chickenpox',
  observation_model='NORMAL',
  timetype='index',
  standardize=['latitude', 'longitude'],
  interactions=[(0, 1), (0, 2), (1, 2)],
  )
{(cos(2πht/p), sin(2πht/p)) ; p ∈ P, h ∈ H^t_p}   Temporal Seasonal Features   (8)

In Eq. (8), the temporal seasonal features are defined by a set P = {p_1, ..., p_ℓ} of ℓ seasonal periods, where each p_i has harmonics H^t_{p_i} ⊂ {1, 2, ..., ⌊p_i/2⌋} for i = 1, ..., ℓ. For example, if the time unit is hourly data and there are ℓ = 2 seasonal effects (daily and monthly), the corresponding periods are p_1 = 24 and p_2 = 730.5, respectively.
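
To make the connection to the constructor arguments concrete, here is a minimal NumPy sketch of how I read Eq. (8); this is just an illustration, not BayesNF's internal code, and the harmonic counts below are arbitrary:

import numpy as np

def temporal_seasonal_features(t, periods, harmonics):
    # My reading of Eq. (8): for each period p and each harmonic h in H^t_p,
    # emit the pair cos(2*pi*h*t/p), sin(2*pi*h*t/p).
    cols = []
    for p, H in zip(periods, harmonics):
        for h in range(1, H + 1):
            cols.append(np.cos(2 * np.pi * h * t / p))
            cols.append(np.sin(2 * np.pi * h * t / p))
    return np.stack(cols, axis=-1)

# The paper's example: hourly data with daily and monthly effects,
# i.e. p1 = 24 and p2 = 730.5 (the harmonic counts 2 and 3 are made up).
t = np.arange(24 * 31)
X = temporal_seasonal_features(t, periods=[24, 730.5], harmonics=[2, 3])
print(X.shape)  # (744, 10): 2 * (2 + 3) feature columns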

It seems like for seasonality_periods, we are specifying the periods of seasonal patterns. And for num_seasonal_harmonics we are specifying the number of seasonal patterns.

If we are adapting to an unknown dataset, we would need to plot the data on a chart, look at the period of possible patterns for seasonality_periods, and then count the number of periods for num_seasonal_harmonics.

For the "London Air Quality Tutorial", I am struggling to see why would use seasonality_periods=['D', 'W'], # Daily and weekly seasonality, same as [24, 24*7] num_seasonal_harmonics=[4, 4], # Four harmonics for each seasonal factor from the charts alone.

Is it possible to automate this process, so we can pick up seasonality without needing to manually look at the data?

If we have time series of inconsistent length, let's say for some locations we have 10 years of data and for others only 5 years, wouldn't that break num_seasonal_harmonics? Maybe it's only an issue if the seasonal period actually differs between locations.

fsaad commented 1 month ago

This table shows the possible seasonality_periods for various measurement frequencies.

[Image: table of seasonality_periods for each measurement frequency]

For example, if the dataset is measured hourly, then you might posit daily and weekly seasonal effects, which gives seasonality_periods = [24, 168]. As for num_seasonal_harmonics, it usually suffices to use a small number for each seasonal period (e.g., 5 or 10 harmonics), unless there is a strong reason to suspect the existence of higher frequency harmonics.
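
Concretely, a configuration along those lines might look like this (the dataset and column names here are hypothetical, just to show where the numbers go):

from bayesnf.spatiotemporal import BayesianNeuralFieldMAP

model = BayesianNeuralFieldMAP(
  width=256,
  depth=2,
  freq='H',                       # hourly measurements
  seasonality_periods=[24, 168],  # daily and weekly seasonality, in hours
  num_seasonal_harmonics=[5, 5],  # a small number of harmonics per period
  feature_cols=['datetime', 'latitude', 'longitude'],
  target_col='pm25',              # hypothetical target column
  observation_model='NORMAL',
  timetype='index',
  standardize=['latitude', 'longitude'],
  )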

For the "London Air Quality Tutorial", I am struggling to see why would use seasonality_periods=['D', 'W'], # Daily and weekly seasonality, same as [24, 24*7] num_seasonal_harmonics=[4, 4], # Four harmonics for each seasonal factor from the charts alone.

There is generally not much harm in including extra seasonal components, since these are used as input features to the Bayesian neural network. The model can learn to ignore these extra input features, if they are not predictive of the observed data.

Is it possible to automate this process, so we can pick up seasonality without needing to manually look at the data?

BayesNF does not support automatically discovering seasonal frequencies and harmonics. For 1D time series, you can see the AutoGP.jl method, which is described in this paper.
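
If you just want a rough starting point before configuring the model, a simple periodogram-style heuristic (plain NumPy, nothing BayesNF-specific) can surface candidate values to try for seasonality_periods:

import numpy as np

def candidate_periods(y, top_k=3):
    # Return the top_k periods (in time steps) with the largest spectral power.
    y = np.asarray(y, dtype=float) - np.mean(y)
    power = np.abs(np.fft.rfft(y)) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0)
    order = np.argsort(power[1:])[::-1] + 1  # skip the zero-frequency term
    return [1.0 / freqs[i] for i in order[:top_k]]

# Toy example: an hourly series with a clear 24-hour cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(24 * 60)
y = np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=t.size)
print(candidate_periods(y))  # expect a value close to 24.0 at the top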

If we have time series of inconsistent length, let's say for some locations we have 10 years of data and for others only 5 years, wouldn't that break num_seasonal_harmonics? Maybe it's only an issue if the seasonal period actually differs between locations.

Right, for determining the seasonal effects, the more important quantity is the frequency of the observations (e.g., hourly, weekly, monthly, yearly, etc.). Having different time series lengths at different locations does not cause a "type-mismatch". Having different observation frequencies at different locations, e.g., location A has hourly data and location B has minutely data, is a bit more challenging. This mismatch could be aligned by using the lowest common denominator (i.e., minutely) as the observation frequency, provided that fractional time steps in location A have meaningful semantics (e.g., for a temperature time series).
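
For example, with pandas the hourly location could be brought onto the minutely grid by interpolation (a sketch with made-up data; again, this only makes sense when in-between values are meaningful):

import numpy as np
import pandas as pd

# Location A: hourly temperature readings (made-up values).
idx_a = pd.date_range('2024-01-01', periods=24, freq='h')
temp_a = pd.Series(20 + 5 * np.sin(2 * np.pi * np.arange(24) / 24), index=idx_a)

# Upsample A onto the minutely grid used by location B.
temp_a_minutely = temp_a.resample('min').interpolate(method='time')
print(temp_a_minutely.head())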

A case in which the mismatch cannot be fixed would be if you have a dataset for "number of home sales", where location A has daily observations (i.e., sum of all home sales that day) and location B has monthly observations (sum of all home sales that month). In this case, it does not make sense to take fractional observations at location B.

f0lie commented 1 month ago

As for num_seasonal_harmonics, it usually suffices to use a small number for each seasonal period (e.g., 5 or 10 harmonics), unless there is a strong reason to suspect the existence of higher frequency harmonics.

I am still trying to comprehend num_seasonal_harmonics. Reading the equation, it seems like increasing num_seasonal_harmonics decreases the period of the cos and sin functions. So we could equivalently reduce seasonality_periods as well.

So suppose we have a dataset of hourly temperature: if we see a daily spike in the morning and evening, we would put seasonality_periods = ['D'] and num_seasonal_harmonics = [2]. Am I understanding this correctly?
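
To check my own understanding, here is a toy example (outside of BayesNF, with hand-picked weights) showing that a 24-hour period with two harmonics can indeed express two peaks per day:

import numpy as np

# Period p = 24 hours with harmonics h = 1, 2: features at periods 24h and 12h.
t = np.arange(24)
p = 24
features = np.stack([np.cos(2 * np.pi * 1 * t / p), np.sin(2 * np.pi * 1 * t / p),
                     np.cos(2 * np.pi * 2 * t / p), np.sin(2 * np.pi * 2 * t / p)], axis=-1)
weights = np.array([0.3, 0.5, 1.0, 0.0])  # hand-picked, purely for illustration
daily = features @ weights
peaks = [h for h in range(24)
         if daily[h] > daily[(h - 1) % 24] and daily[h] > daily[(h + 1) % 24]]
print(peaks)  # [0, 11]: two local maxima within one 24-hour cycle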

Right, for determining the seasonal effects, the more important quantity is the frequency of the observations (e.g., hourly, weekly, monthly, yearly, etc.). Having different time series lengths at different locations does not cause a "type-mismatch". Having different observation frequencies at different locations, e.g., location A has hourly data and location B has minutely data, is a bit more challenging. This mismatch could be aligned by using the lowest common denominator (i.e., minutely) as the observation frequency, provided that fractional time steps in location A have meaningful semantics (e.g., for a temperature time series).

I was thinking more of the case where some locations have seasonal effects and other locations don't.

For example, one office location works 6 days a week, while another works 3 days at the office and the other 2 days at home. So even though the data could be of the same kind, let's say commuted miles per day, the seasonal patterns would be very different.

There is another case where the seasonal patterns are offset from each other, like temperature across the globe and the seasons. Since the northern and southern hemispheres of the Earth have opposite seasons, the temperature patterns would be opposite as well.

It seems like the multiple layers of the model should be able to capture nuanced cases like this. Since the seasonality enters as features to a function that is learned across multiple layers, the model can effectively prune a seasonal effect that doesn't apply to some locations.
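
For instance (a toy illustration, not BayesNF internals), the same yearly cos/sin features can represent opposite-phase seasons simply by flipping the sign of the learned weights per location:

import numpy as np

t = np.arange(365)
yearly = np.stack([np.cos(2 * np.pi * t / 365.25), np.sin(2 * np.pi * t / 365.25)], axis=-1)
north = yearly @ np.array([-10.0, 0.0]) + 15.0  # coldest around day 0 (January)
south = yearly @ np.array([+10.0, 0.0]) + 15.0  # warmest around day 0 (January)
print(north[0], south[0])  # 5.0 and 25.0: opposite phases from the same features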