helske / KFAS

KFAS: R Package for Exponential Family State Space Models

Negative K #35

Closed SuYe99 closed 5 years ago

SuYe99 commented 5 years ago

Hello, I am working on applying KFAS to remote sensing time series (lots of missing values here). I am trying to do one-step prediction for a short-wave infrared band using the following code:

```r
# Here I intend to keep the intercept fixed (162 is the result from an initial model)
TwoHarmonicsCycle <- SSModel(swir ~ SSMtrend(2, Q = list(matrix(5), matrix(NA)),
                                             P1 = matrix(0, 2, 2),
                                             a1 = matrix(c(162, 0))) +
                               SSMcycle(period = 365.25, Q = matrix(NA)) +
                               SSMcycle(period = 365.25 / 2, Q = matrix(NA)),
                             H = matrix(NA))

TwoHarmonicsCycle.fit <- fitSSM(TwoHarmonicsCycle, inits = c(0, 0, 0, 0, 0, 0),
                                method = "BFGS")
TwoHarmonicsCycle.out <- KFS(TwoHarmonicsCycle.fit$model, nsim = 10000,
                             simplify = FALSE)
TwoHarmonicsCycle.predict <- predict(TwoHarmonicsCycle.fit$model,
                                     interval = "prediction", nsim = 10000,
                                     level = .9, filtered = TRUE)
```

I got the following results:

[figure: one-step prediction (black line) vs. real observations (red line)]

The one-step prediction (black line) is very wild at the beginning of the curve (it has extreme values), and then converges to a reasonable prediction after some point in 2000. I also noticed that some elements of K are negative at the beginning of the curve, and all become positive after the one-step prediction converges. I suspect that the negative K values are what cause the wild curve here. My questions are: 1) is there any treatment for this kind of wild curve at the beginning of Kalman filtering; 2) if there is no treatment (Kalman filtering may need an initialization phase), can I use "all K are positive" as a signal for the start of regular prediction?

helske commented 5 years ago

First, your initial state is not fixed: you should also set P1inf = matrix(0, 2, 2) in the SSMtrend call. This is a somewhat unfortunate feature; by default P1inf is diagonal, and it is overwritten only if the diagonal values of P1 are nonzero. I guess I should add a condition that if a1 is given, then P1inf is set to zero...
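Applied to the model from the first post, the fix would look something like this (a sketch; only the SSMtrend arguments change):

```r
# Fully fixed initial state for the trend: both P1 and P1inf set to zero
TwoHarmonicsCycle <- SSModel(swir ~ SSMtrend(2, Q = list(matrix(5), matrix(NA)),
                                             a1 = matrix(c(162, 0)),
                                             P1 = matrix(0, 2, 2),
                                             P1inf = matrix(0, 2, 2)) +
                               SSMcycle(period = 365.25, Q = matrix(NA)) +
                               SSMcycle(period = 365.25 / 2, Q = matrix(NA)),
                             H = matrix(NA))
```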

Second, a minor thing: you don't need the nsim argument, as your model is Gaussian.

In general K can be negative (K = PZ, where P is always positive but Z can contain negative values), but in your case K should be positive at least after the diffuse phase (the diffuse phase is a bit trickier). In the case of diffuse initialization (P1inf > 0) you should focus on the time points after the initialization (the output of KFS contains values d and j, which give you the last diffuse time index). But I have noticed that in some cases the diffuse initialization is somewhat unstable with long cycles or similar components, so it might make sense to use a slightly informative prior P1 for the cycle components.
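To locate the end of the diffuse phase in practice, you can read d (and j) from the KFS output; a minimal sketch, assuming the fitted model from the code above:

```r
out <- KFS(TwoHarmonicsCycle.fit$model, simplify = FALSE)
out$d  # last time index of the diffuse phase
out$j  # last observation index within time point d
# Keep only one-step state predictions from after the diffuse phase:
stable_a <- out$a[-seq_len(out$d), , drop = FALSE]
```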

SuYe99 commented 5 years ago

Thank you so much for this guidance! It took me several days to digest it. And yes, when I add P1inf = matrix(0, 2, 2), the predicted curve becomes much better. See below (the black line is the one-step prediction; the red line is the original data):

[figure plot23_new: one-step prediction (black) vs. original data (red)]

However, as you mentioned, there are still some very unstable predictions at the early stage (see above), which occur after the diffuse stage has finished (d = 33 in this case). If I can't get an informative P1 (which is hard, because P1 is a 6x6 matrix), is there a method to tell when the steady stage has started (or the unstable stage has ended)? Maybe P converges to a steady state? Durbin and Koopman's book introduces a method for checking whether the model has reached steady state (chapter 2.11), but I am not sure it is the best way. Or is there an easy way to choose a prior P1 for the cycle components?

helske commented 5 years ago

I agree that it is hard to be precise about P1 (some actually advocate estimating it at the same time as the other model parameters). But if you regard it as an a priori covariance matrix for the states, you can try to be conservative while still more informative than the fully diffuse approach (using P1inf). So for example you could try something like P1 = diag(1000, 6), depending on the assumed scale of the cycle, and see how sensitive your results are to that.
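In the model above this could look roughly like the following (a sketch; the value 1000 is only a placeholder for the assumed cycle scale, and each SSMcycle contributes two states, so the two cycle blocks together give the diag(1000, ...) prior on the cycle part of the state):

```r
# Informative P1 for the cycle components, trend initial state kept fixed
TwoHarmonicsCycle <- SSModel(swir ~ SSMtrend(2, Q = list(matrix(5), matrix(NA)),
                                             a1 = matrix(c(162, 0)),
                                             P1 = matrix(0, 2, 2),
                                             P1inf = matrix(0, 2, 2)) +
                               SSMcycle(period = 365.25, Q = matrix(NA),
                                        P1 = diag(1000, 2)) +
                               SSMcycle(period = 365.25 / 2, Q = matrix(NA),
                                        P1 = diag(1000, 2)),
                             H = matrix(NA))
```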

It looks like your model does not contain any time-varying system matrices, so P should converge to a steady state at some point, assuming that you don't have missing values in your series. Chapter 4.3.4 of the DK 2012 book gives the general steady-state equation; you could check whether the current P solves that equation.
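One way to test this numerically for the model above (a sketch; `t0` is a placeholder index you would choose after the diffuse phase, and the reshaping assumes time-invariant system matrices, which KFAS stores as 3-dimensional arrays):

```r
mod <- TwoHarmonicsCycle.fit$model
out <- KFS(mod, simplify = FALSE)

# Extract the (time-invariant) system matrices at t = 1
Tt <- matrix(mod$T[, , 1], nrow = dim(mod$T)[1])
Zt <- matrix(mod$Z[, , 1], nrow = dim(mod$Z)[1])
Ht <- matrix(mod$H[, , 1], nrow = dim(mod$H)[1])
Rt <- matrix(mod$R[, , 1], nrow = dim(mod$R)[1])
Qt <- matrix(mod$Q[, , 1], nrow = dim(mod$Q)[1])

t0 <- 100                 # placeholder: some time point after the diffuse phase
Pt <- out$P[, , t0]       # one-step prediction variance at t0
Ft <- Zt %*% Pt %*% t(Zt) + Ht

# Right-hand side of the steady-state equation (DK 2012, ch. 4.3.4):
# P = T (P - P Z' F^{-1} Z P) T' + R Q R'
rhs <- Tt %*% (Pt - Pt %*% t(Zt) %*% solve(Ft) %*% Zt %*% Pt) %*% t(Tt) +
  Rt %*% Qt %*% t(Rt)
# In steady state, rhs should equal Pt up to numerical tolerance
```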

SuYe99 commented 5 years ago

Thank you for your patience in answering my questions. Unfortunately, I have lots of missing values in my time series: the ratio of missing to existing observations is around 10:1. Missing observations are very common in remote sensing time series because of clouds as well as the satellite revisit cycle. So the equation in Chapter 4.3.4 is for situations where there are no missing observations?
I am thinking of a practical use of the equation in Chapter 4.3.4, assuming it can still be applied with a substantial amount of missing data. I am considering checking whether the matrices on the left side and the right side are close enough. Do you have any suggestion for how to compute the similarity between two such matrices? A threshold?

helske commented 5 years ago

Yes, the equation in the book is for the case with no missing observations (I missed the missing-data part in your first post). The issue with missing values is that P increases every time the Kalman filter encounters a missing observation, so the filter cannot really converge to a steady state (although it can temporarily reach one if there is a long period of complete data).

I tried to simulate some data from your model, and it looks like there are indeed some numerical issues sometimes, even without missing data (KFS complains about negative smoothing variances), depending on the model parameters. So one thing you might want to do is fit the model multiple times with different initial values, to be sure you have found the global optimum; it could be that the model fits the data poorly, which leads to numerical issues.
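A multi-start fit could be sketched like this (an illustration only; the number of restarts and the spread of the random initial values are arbitrary choices, and fitSSM's `optim.out$value` is the minimized negative log-likelihood, so smaller is better):

```r
set.seed(1)
best <- NULL
for (i in 1:20) {
  inits <- rnorm(6, sd = 2)  # random initial values for the 6 variance parameters
  fit <- try(fitSSM(TwoHarmonicsCycle, inits = inits, method = "BFGS"),
             silent = TRUE)
  if (!inherits(fit, "try-error") &&
      (is.null(best) || fit$optim.out$value < best$optim.out$value)) {
    best <- fit
  }
}
```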

SuYe99 commented 5 years ago

Thank you. Do you mean the initial values for Q and H in fitSSM, or a1 and P1? What I am trying now is to define a "stable stage" at the very beginning of the time series and run a simple harmonic regression on it. This helps me find an appropriate, if not the best, a1. Then I start a normal Kalman filter right after the "stable stage", using the a1 estimated from the regression. The model seems to converge much more quickly, and there are no warnings any more in my current sample test. But I still haven't figured out a way to estimate P from the "stable stage"; for now I used your suggested P1 = diag(1000, 6). Any thoughts?

SuYe99 commented 5 years ago

I am going to close this issue. I decided to use lasso regression to determine the initial a1. For P1, I used a method of fitting the model twice: first fit with the initial values, then use the stable P from that fit as P1 for the second fit. It seems to work well so far.
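The fit-twice idea could be sketched as follows (an illustration of the approach described above, not code from the thread; `t_stable` is a placeholder for a time point where P has stabilized):

```r
# First pass: fit with the initial, rough P1
fit1 <- fitSSM(TwoHarmonicsCycle, inits = c(0, 0, 0, 0, 0, 0), method = "BFGS")
out1 <- KFS(fit1$model, simplify = FALSE)

# Use a stabilized filtered state covariance as the new P1
fit1$model$P1[] <- out1$P[, , t_stable]
fit1$model$P1inf[] <- 0

# Second pass: refit with the informative P1
fit2 <- fitSSM(fit1$model, inits = c(0, 0, 0, 0, 0, 0), method = "BFGS")
```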