mi3nts / mints-aq-reports

Repository for generation of MINTS automated reports
https://mi3nts.github.io/mints-aq-reports/

Time Scale Identification for Time Series Data #15

Open john-waczak opened 1 year ago

john-waczak commented 1 year ago

One question we seek to answer with our sensor network is: what time scales are relevant to local air quality variability?

To that end, let's use this issue to track progress on methods for identifying key temporal scales in our time series data. There are 3 immediate methods that come to mind (there are likely many others we can try out & compare):

1. Temporal Semi-variogram

The semi-variogram $\gamma(\Delta t)$ measures half the expected squared difference between values of a time series $Z(t)$ separated by a lag $\Delta t$, that is

$$ \gamma(\Delta t) = \frac{1}{2}\mathbb{E}\left[ \left( Z(t) - Z(t+\Delta t) \right)^2 \right] $$

A nice property of the semi-variogram is that it has units of $Z^2$. Therefore, we can both estimate the relevant time scales and get an idea of the measurement uncertainty by extrapolating the variogram to $\Delta t = 0$, where we would expect no variance in the absence of instrument uncertainty. Any nonzero intercept (the "nugget") can then be attributed to measurement noise.
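A minimal sketch of the estimator above for a regularly sampled series, in plain Python. The function names, the synthetic signal, and the straight-line extrapolation through the first two lags are all illustrative assumptions, not the method used in the MINTS tools; a fitted variogram model would normally replace the crude extrapolation.

```python
import math
import random


def empirical_semivariogram(z, max_lag):
    """Estimate gamma(k) = 0.5 * mean((z[i] - z[i+k])**2) for k = 1..max_lag.

    Assumes z is regularly sampled, so lag k means k sampling intervals.
    """
    n = len(z)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(z)")
    gammas = []
    for k in range(1, max_lag + 1):
        squared_diffs = [(z[i] - z[i + k]) ** 2 for i in range(n - k)]
        gammas.append(0.5 * sum(squared_diffs) / len(squared_diffs))
    return gammas


def extrapolate_to_zero_lag(gammas):
    """Crude intercept: straight line through the first two lags (k = 1, 2)."""
    return 2.0 * gammas[0] - gammas[1]


# Synthetic stand-in for sensor data (not MINTS data): a slow oscillation
# with a period of 200 samples plus Gaussian noise with variance 0.01.
random.seed(0)
noise_std = 0.1
z = [math.sin(2 * math.pi * i / 200) + random.gauss(0.0, noise_std)
     for i in range(5000)]

gammas = empirical_semivariogram(z, max_lag=100)
intercept = extrapolate_to_zero_lag(gammas)
print(f"extrapolated gamma(0): {intercept:.4f} "
      f"(true noise variance: {noise_std ** 2:.4f})")
```

In this toy case the variogram rises from roughly the noise variance at short lags toward its maximum at half the oscillation period, so the shape carries the time-scale information and the intercept carries the noise estimate.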

2. Autocorrelation

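While the variogram measures the mean squared difference between a signal and a lagged copy of itself, the autocorrelation function measures the correlation of a signal with itself as a function of temporal displacement $\Delta t$. For a stationary signal with variance $\sigma^2$ the two are related by $\gamma(\Delta t) = \sigma^2\left(1 - \rho(\Delta t)\right)$, where $\rho$ is the autocorrelation. Two nice features of the autocorrelation are that it can be computed quickly via the Fourier transform and that it is scaled to [-1, 1].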
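A minimal sketch of the sample autocorrelation, written in plain Python with a direct sum instead of the Fourier-transform route so that it has no dependencies. The function names and the test signal are illustrative assumptions. The estimator divides every lag by the lag-0 sum, which keeps all values within [-1, 1].

```python
import math


def autocorrelation(z, max_lag):
    """Sample autocorrelation r[k] for k = 0..max_lag (r[0] is always 1)."""
    n = len(z)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(z)")
    mean = sum(z) / n
    dev = [x - mean for x in z]
    c0 = sum(d * d for d in dev)
    if c0 == 0.0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
            for k in range(max_lag + 1)]


def first_peak_lag(r):
    """Return the first lag k >= 1 that is a local maximum of r, or None."""
    for k in range(1, len(r) - 1):
        if r[k] > r[k - 1] and r[k] >= r[k + 1]:
            return k
    return None


# Synthetic example: a pure oscillation with a period of 24 samples
# (think of hourly data with a daily cycle).
z = [math.sin(2 * math.pi * i / 24) for i in range(240)]
r = autocorrelation(z, max_lag=36)
dominant_lag = first_peak_lag(r)
print("first autocorrelation peak at lag", dominant_lag)
```

The first peak after lag 0 is one simple way to read a dominant time scale off the autocorrelation; for long series the direct sum would be replaced by the FFT-based computation mentioned above.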

3. Spectrogram/Scaleogram

Using the FFT and/or the Continuous Wavelet Transform, we can visualize the frequency content of our time series over a sliding window. See this introduction on wavelets
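A minimal sketch of the sliding-window idea, assuming a naive discrete Fourier transform as a dependency-free stand-in for the FFT (same result, much slower). The function names, the Hann taper, and the window and hop sizes are illustrative assumptions. The wavelet scaleogram is not shown.

```python
import cmath
import math


def windowed_power(segment):
    """Power at DFT bins 0..n//2 of a mean-removed, Hann-tapered segment."""
    n = len(segment)
    mean = sum(segment) / n
    tapered = [(v - mean) * 0.5 * (1.0 - math.cos(2 * math.pi * j / (n - 1)))
               for j, v in enumerate(segment)]
    power = []
    for f in range(n // 2 + 1):
        s = sum(tapered[j] * cmath.exp(-2j * math.pi * f * j / n)
                for j in range(n))
        power.append(abs(s) ** 2)
    return power


def spectrogram(z, window, hop):
    """One power spectrum per window position (rows: time, columns: frequency)."""
    return [windowed_power(z[start:start + window])
            for start in range(0, len(z) - window + 1, hop)]


def peak_bin(power):
    """Index of the frequency bin holding the most power."""
    return max(range(len(power)), key=power.__getitem__)


# Synthetic signal whose period changes halfway through:
# 16 samples per cycle at first, then 8 samples per cycle.
z = [math.sin(2 * math.pi * i / 16) if i < 128 else math.sin(2 * math.pi * i / 8)
     for i in range(256)]

frames = spectrogram(z, window=64, hop=32)
# Bin f corresponds to a period of window / f samples.
print("peak bin, first frame:", peak_bin(frames[0]))
print("peak bin, last frame:", peak_bin(frames[-1]))
```

Because each row is computed from a local window, a change in the dominant period shows up as a shift in the peak bin over time, which a single spectrum of the whole series would blur together.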

Other Ideas

john-waczak commented 1 year ago

Interesting paper comparing the autocorrelation to the semi-variogram

john-waczak commented 1 year ago

Created a repo for time series analysis tools in Julia here. I have a basic variogram fit for regularly spaced time series. We should add a second option for generic time series.
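One possible approach for the generic case, assuming "generic" means irregularly spaced timestamps: form all pairs of samples, bin them by time separation, and average the half squared differences within each bin. This is a plain-Python sketch and not code from the Julia repo; the function name, bin width, and return format are illustrative assumptions.

```python
import math


def binned_semivariogram(t, z, bin_width, max_lag):
    """Semivariogram for irregular sampling, binned by time separation.

    Returns (bin_center, gamma, pair_count) for every non-empty lag bin.
    Bin b covers separations in [b * bin_width, (b + 1) * bin_width).
    """
    if len(t) != len(z):
        raise ValueError("t and z must have the same length")
    nbins = int(math.ceil(max_lag / bin_width))
    sums = [0.0] * nbins
    counts = [0] * nbins
    n = len(t)
    for i in range(n):
        for j in range(i + 1, n):
            dt = abs(t[j] - t[i])
            if dt == 0.0 or dt >= max_lag:
                continue  # skip duplicate timestamps and pairs beyond max_lag
            b = min(int(dt / bin_width), nbins - 1)
            sums[b] += 0.5 * (z[j] - z[i]) ** 2
            counts[b] += 1
    return [((b + 0.5) * bin_width, sums[b] / counts[b], counts[b])
            for b in range(nbins) if counts[b] > 0]


# Tiny example with uneven gaps between samples.
t = [0.0, 1.0, 3.0]
z = [0.0, 1.0, 3.0]
for center, gamma, count in binned_semivariogram(t, z, bin_width=1.0, max_lag=4.0):
    print(center, gamma, count)
```

The double loop costs O(n^2) pair evaluations, so long records would need subsampling or a cap on the maximum lag; the pair count per bin is returned because sparsely populated bins give unreliable estimates.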