cerlymarco / tsmoothie

A python library for time-series smoothing and outlier detection in a vectorized way.
MIT License

Question on KalmanSmoother usage #7

Closed turmeric-blend closed 3 years ago

turmeric-blend commented 3 years ago

Hi, I have a time series that has seasonality in certain time windows (let's call these sw) and no seasonality in other windows (let's call these nsw). I plan to pass random windows of this time series into the smoother.

I am trying to use KalmanSmoother and am deciding between:

smoother1 = ts.smoother.KalmanSmoother(component='level_trend_season', 
                                       component_noise={'level':0.1, 'trend':0.1, 'season':0.1})

vs

smoother2 = ts.smoother.KalmanSmoother(component='level_trend', 
                                       component_noise={'level':0.1, 'trend':0.1})

If the random window slice is sw, smoother1 should work just fine, and in nsw cases smoother2 should work better. However, I can only use one smoother.

My question is: if I pass nsw into smoother1, will performance degrade compared to passing nsw into smoother2? Is smoother1 smart enough to "ignore" the fact that nsw has no seasonality?

cerlymarco commented 3 years ago

Hi,

This is more like an opinion-based question related to your domain of analysis.

I think the best thing to do here is to take some series you have at your disposal and pass nsw into smoother1 and nsw into smoother2. That way, you end up with two smoothed results that you can inspect. If the impact of the seasonal component is strong you should adjust its impact in the component_noise, or consider removing the seasonal component altogether. You can also use two separate smoothers if you are able to detect in some deterministic way the separation between nsw and sw.
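One possible way to make the nsw/sw separation deterministic is to check the autocorrelation of each window at the seasonal lag and route the window accordingly. The sketch below is only an illustration, not part of tsmoothie: the helper names (autocorr_at_lag, pick_smoother), the period of 12 and the 0.5 threshold are all assumptions, and the returned labels simply stand for the two configurations from the question. It also assumes the window has no strong trend, since a trend inflates autocorrelation at every lag (differencing the window first may help in that case).

```python
import math
import random


def autocorr_at_lag(window, lag):
    """Sample autocorrelation of `window` at the given lag (mean removed)."""
    n = len(window)
    if lag >= n:
        return 0.0
    mean = sum(window) / n
    centered = [x - mean for x in window]
    denom = sum(c * c for c in centered)
    if denom == 0:
        return 0.0  # constant window: no seasonality to detect
    num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return num / denom


def pick_smoother(window, period, threshold=0.5):
    """Return a label for the configuration to use on this window.

    `period` and `threshold` are hypothetical tuning values, not library defaults.
    """
    if autocorr_at_lag(window, period) > threshold:
        return "smoother1"  # treat as sw: level + trend + season
    return "smoother2"      # treat as nsw: level + trend only


random.seed(0)
# Synthetic sw window: a period-12 sine wave with a little noise.
seasonal_window = [math.sin(2 * math.pi * t / 12) + random.gauss(0, 0.1)
                   for t in range(120)]
# Synthetic nsw window: pure noise.
flat_window = [random.gauss(0, 1) for _ in range(120)]

choice_sw = pick_smoother(seasonal_window, period=12)
choice_nsw = pick_smoother(flat_window, period=12)
```

The threshold would need tuning on real windows; inspecting the smoothed outputs of both smoothers on a few borderline windows, as suggested above, is a reasonable way to pick it.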

turmeric-blend commented 3 years ago

if the impact of the seasonal component is strong you should adjust its impact in the component_noise

I am not so familiar with Kalman filters. May I ask what the component_noise parameter represents?

For example, is it the degree of noise I expect my time series to have? So if I set component_noise of trend=0.01, I am telling the KalmanSmoother that my time-series data has low noise in the trend component, and if instead trend=0.7, I am telling it that my time-series data has high noise in the trend component? This is just my guess. I would appreciate it if you could help me clarify this. Thanks!

cerlymarco commented 3 years ago

This is not about the Kalman filter; it's more related to the Unobserved Components Model (UCM), here is a general example. tsmoothie uses the Kalman filter to perform smoothing by building a UCM. In general, if you are confident about the presence of a particular component (trend, seasonality, etc.), you set a low sigma; otherwise you set a high sigma.
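To give a rough feel for what a per-component sigma does, here is a minimal sketch of the simplest UCM, a textbook local level model (observation = level + noise, level follows a random walk), smoothed with a scalar Kalman filter followed by a backward (Rauch-Tung-Striebel) pass. This is not tsmoothie's implementation: the function names, the parameters q (level noise variance) and r (observation noise variance), and the idea that component_noise plays a role analogous to q are assumptions made for illustration.

```python
import random


def local_level_smooth(y, q, r=1.0):
    """Smooth y with a local level model.

    q: variance of the random-walk steps of the level (the component's noise).
    r: variance of the observation noise.
    """
    n = len(y)
    m_pred, p_pred = [0.0] * n, [0.0] * n   # one-step-ahead mean / variance
    m_filt, p_filt = [0.0] * n, [0.0] * n   # filtered mean / variance
    for t in range(n):
        if t == 0:
            mp, pp = y[0], 1e6               # vague prior on the initial level
        else:
            mp, pp = m_filt[t - 1], p_filt[t - 1] + q
        gain = pp / (pp + r)                 # Kalman gain
        m_pred[t], p_pred[t] = mp, pp
        m_filt[t] = mp + gain * (y[t] - mp)
        p_filt[t] = (1.0 - gain) * pp
    # Backward pass: refine each estimate using later observations.
    m_smooth = m_filt[:]
    for t in range(n - 2, -1, -1):
        g = p_filt[t] / p_pred[t + 1]
        m_smooth[t] = m_filt[t] + g * (m_smooth[t + 1] - m_pred[t + 1])
    return m_smooth


def roughness(series):
    """Sum of squared first differences; lower means smoother."""
    return sum((b - a) ** 2 for a, b in zip(series, series[1:]))


random.seed(0)
y = [0.05 * t + random.gauss(0, 1) for t in range(100)]

stable_level = local_level_smooth(y, q=0.001)  # low sigma: level assumed stable
loose_level = local_level_smooth(y, q=1.0)     # high sigma: level free to move
```

In this toy model a low q yields a stiff, smooth level, while a high q lets the level follow the observations closely. What matters is the ratio between q and r, not q on its own.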

turmeric-blend commented 3 years ago

sigma being the component_noise?

tawdes commented 2 years ago

Hi Marco, I am new to tsmoothie and also to Kalman filters, and I am in the process of understanding them. I have a question about component_noise. I have a time series where daily seasonality is prominent, so component_noise with season=0.1 mostly works well (a low sigma value, as I am confident about the daily seasonality). But I have also tried values like 0.01 and 1 for the same component. I want to know whether there are any limits or a valid range for the sigma values of component_noise, e.g. 0 to 1 (0 to 100%)?