dstl / Stone-Soup

A software project to provide the target tracking community with a framework for the development and testing of tracking algorithms.
https://stonesoup.rtfd.io
MIT License

Smoothing gain in KF smoother #941

Closed narykov closed 4 months ago

narykov commented 5 months ago

I believe there is a typo in how the smoothing gain is implemented for the KF smoother: https://github.com/dstl/Stone-Soup/blob/0a9ebe2584eb6cd3d77756fd9dfa7b98649799fd/stonesoup/smoother/kalman.py#L131-L148

Basically, the function aims to return the quantity defined in [1], $$G_{k} = P_{k} F_{k}^{T} (P_{k+1|k})^{-1},$$ but the implementation instead evaluates $$G_{k} = P_{k} F_{k-1}^{T} (P_{k+1|k})^{-1}.$$

Specifically, the second term self._transition_matrix(state, **kwargs), responsible for $F_{k}$, obtains it from the current state by looking into state.hypothesis.prediction, i.e., the prediction from $k-1$ to $k$, which contains $F_{k-1}$ and not $F_{k}$. However, if the function were implemented using prediction, which describes the prediction from $k$ to $k+1$, like this: self._transition_matrix(prediction, **kwargs), it would actually access $F_{k}$ as required by the definition.
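For concreteness, here is a hedged paraphrase of the linked method (not the verbatim code), with the suggested one-line change shown as a comment:

```python
import numpy as np

def _smooth_gain(self, state, prediction, **kwargs):
    # state: the updated state at time k; prediction: the k -> k+1 prediction,
    # so prediction.covar is P_{k+1|k}.
    # Current behaviour: F is fetched via state.hypothesis.prediction, which
    # describes the k-1 -> k step and therefore yields F_{k-1}.
    transition_matrix = self._transition_matrix(state, **kwargs)
    # Suggested change: fetch F from prediction (the k -> k+1 step), i.e. F_k:
    # transition_matrix = self._transition_matrix(prediction, **kwargs)
    return state.covar @ transition_matrix.T @ np.linalg.inv(prediction.covar)
```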

[1] See Algorithm 1 in https://livrepository.liverpool.ac.uk/3015339/1/PL_smoothing_accepted1.pdf

sdhiscocks commented 5 months ago

Looking at the reference used in the code, and comparing it to the reference you've mentioned, I think the current code is correct.

In the code state is the state at $k$, and subsq_state is $k+1$.

The code gets the prediction from the $k+1$ state subsq_state which should be the prediction from $k$ to $k+1$, so prediction.covar is $P_{k+1}^{-}$. The self._transition_matrix(state, **kwargs) should correctly be $F_{k}$.

narykov commented 5 months ago

Hi Steven, thanks for looking into this.

TL;DR: We are interested in getting the matrix $F$ that carries the state from $k$ into $k+1$. Let's assume state is the system's prior at $k=0$, and prediction, or subsq_state, is the first measurement update, which happens at $k=1$. Trying to access $F$ with self._transition_matrix(state, **kwargs) will yield nothing, as the prior is not endowed with a hypothesis. On the contrary, using self._transition_matrix(prediction, **kwargs) will produce the desired quantity.
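A minimal sketch of that failure mode (assuming the prior is a bare GaussianState, as in the Stone Soup tutorials):

```python
import numpy as np
from stonesoup.types.state import GaussianState

# A prior is typically a bare GaussianState: it carries no hypothesis, so any
# helper that reaches for state.hypothesis.prediction has nothing to work with.
prior = GaussianState(np.array([[0.], [1.]]), np.eye(2))
print(hasattr(prior, 'hypothesis'))  # False: no F is recoverable from the prior
```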

My understanding is that state is the updated state $(x_k, P_k)$ at time $k$ that has the prediction information $(x_{k}^{-}, P_{k}^{-})$ stored in its state.hypothesis.prediction, which is [1, Eq. (4.20)]:

\begin{align}
x_{k}^{-}&=F_{\color{red}{k-1}}x_{k-1},
\\
P_{k}^{-}&=F_{\color{red}{k-1}}P_{k-1}F_{\color{red}{k-1}}^{T}+Q_{\color{red}{k-1}},
\end{align}

meaning it misses the desired information on $F_k$ and $Q_k$ (it only carries $F_{k-1}$ and $Q_{k-1}$). Consequently, if we wish to obtain $F_k$ and $Q_k$, we need to look into subsq_state, i.e., $(x_{k+1}, P_{k+1})$ at time $k+1$ (referred to as prediction in the script above), and access its prediction $(x_{k+1}^{-}, P_{k+1}^{-})$ stored in subsq_state.hypothesis.prediction, which will then contain $F_k$:

\begin{align}
x_{k+1}^{-}&=F_{k}x_{k},
\\
P_{k+1}^{-}&=F_{k}P_{k}F_{k}^{T}+Q_{k}.
\end{align}

In other words, right after executing the update at time $k$ we have no information about which $F_k$ will be used to carry the system into the next time step; it is only recoverable from the subsequent state at $k+1$.
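To make the index bookkeeping concrete, here is a self-contained NumPy sketch of one RTS backward step under this convention ($F_k$ and $Q_k$ describe the $k \rightarrow k+1$ step); the names are illustrative, not Stone Soup API:

```python
import numpy as np

def rts_backward_step(x_k, P_k, F_k, Q_k, x_s_next, P_s_next):
    # x_k, P_k: filtered (updated) moments at time k.
    # F_k, Q_k: model for the k -> k+1 step (stored with the k+1 prediction).
    # x_s_next, P_s_next: smoothed moments at time k+1.
    # Re-create the k -> k+1 prediction: x_{k+1|k}, P_{k+1|k}.
    x_pred = F_k @ x_k
    P_pred = F_k @ P_k @ F_k.T + Q_k
    # Smoothing gain G_k = P_k F_k^T (P_{k+1|k})^{-1}.
    G_k = P_k @ F_k.T @ np.linalg.inv(P_pred)
    # Smoothed moments at time k.
    x_s = x_k + G_k @ (x_s_next - x_pred)
    P_s = P_k + G_k @ (P_s_next - P_pred) @ G_k.T
    return x_s, P_s
```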

Important: I believe one source of confusion is the erroneous notation that is sometimes used to write down the prediction. Specifically, note that the Kalman prediction equations above differ from those found in the Stone Soup tutorial [2], on the Wikipedia page [3], and in other authoritative sources [4]; the difference concerns the time indices that I highlighted in red. Those references contradict the established maths, such as that in [1] or in reference [5] provided in the tutorial.

Let me know if this is unclear and you'd prefer me to put together a piece of code demonstrating this behaviour; I can try to come up with something.

[1] Algorithm 1 in https://livrepository.liverpool.ac.uk/3015339/1/PL_smoothing_accepted1.pdf or Eq. (4.20) in https://users.aalto.fi/~ssarkka/pub/cup_book_online_20131111.pdf
[2] https://stonesoup.readthedocs.io/en/latest/auto_tutorials/01_KalmanFilterTutorial.html#construct-a-kalman-filter
[3] https://en.wikipedia.org/wiki/Kalman_filter#Details
[4] Eq. (11) in https://people.eecs.berkeley.edu/~pabbeel/cs287-fa13/optreadings/Arulampalam_etal_2002.pdf
[5] Eq. (1.1) in http://users.cecs.anu.edu.au/~john/papers/BOOK/B02.PDF

sdhiscocks commented 5 months ago

Thanks for clarifying @narykov. I can see the issue now. I misunderstood and thought it was about which state would be used to calculate the Jacobian, but I can see the issue is the incorrect model. So the code is currently using the correct state to calculate the Jacobian, but in combination with the wrong model.

I think that self._transition_matrix(prediction, **kwargs) will also produce incorrect results, as this switches the problem to using the correct model but the wrong state to calculate the Jacobian. (If you're dealing with a linear model, this isn't an issue.)

I've opened #945 which I believe may fix this.

jmbarr commented 5 months ago

but using the wrong state to calculate the Jacobian.

I'm not sure it is. If you accept that you get $F_{k \rightarrow k+1}$ from the prediction of the subsequent state, then you get the linearisation about $\mu_k$, i.e. at $k$. Which is what you want, no? If you go to state (the current state), then you'll get $F_{k-1 \rightarrow k}$, which gives the Jacobian at $k-1$.

(I'm sure you've spotted already that @narykov is using the (Sarkka) style where $F_{k}$ indicates 'transition from $k$ to $k+1$' and the Jacobian is therefore defined as $\mathbf{\nabla f}|_{x=\mu_k}$. This is in contrast to Stone Soup, which uses $F_k$ to mean 'the transition from $k-1$ to $k$', which I think is more common but has the confusing property that the linearisation is done at $x=\mu_{k-1}$.)
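To spell out the two conventions side by side (a restatement of the parenthetical above, not a quote from either source):

\begin{align}
\text{Sarkka style:} \quad x_{k+1} &= f(x_{k}), & F_{k} &= \mathbf{\nabla f}|_{x=\mu_{k}},\\
\text{Stone Soup style:} \quad x_{k} &= f(x_{k-1}), & F_{k} &= \mathbf{\nabla f}|_{x=\mu_{k-1}}.
\end{align}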

sdhiscocks commented 5 months ago

but using the wrong state to calculate the Jacobian.

I'm not sure it is. If you accept that you get Fk→k+1 from the prediction of the subsequent state then you get the linearisation about μk, i.e. at k. Which is what you want, no? If you go to state (the current state) then you'll get Fk−1→k, which gives the Jacobian at k−1.

So yes, I believe the proposed change now correctly gets $F_{k \rightarrow k+1}$ like you say: the transition model from the prediction, with linearisation about the state ($\mu_k$).

jmbarr commented 5 months ago

So yes, the proposed change I believe is now correctly gets Fk→k+1 like you say: the transition model from the prediction, and will do linearisation about the state (μk).

You mean proposed change #945? But just changing self._transition_matrix(state, **kwargs) to self._transition_matrix(prediction, **kwargs) in _smooth_gain() does all that.

jmbarr commented 5 months ago

So yes, the proposed change I believe is now correctly gets Fk→k+1 like you say: the transition model from the prediction, and will do linearisation about the state (μk).

You don't want to do linearisation about state -- that's at $\mu_{k-1}$. There's also no need to invoke extra code to use the "transition matrix from state but at $\mu_{k}$" because you can get that from prediction directly.

sdhiscocks commented 5 months ago

So yes, the proposed change I believe is now correctly gets Fk→k+1 like you say: the transition model from the prediction, and will do linearisation about the state (μk).

You mean proposed change #945? But just changing self._transition_matrix(state, **kwargs) to self._transition_matrix(prediction, **kwargs) in _smooth_gain() does all that.

If you do that, it results in a second issue with the EKF, as the code will now do self._transition_model(prediction).jacobian(prediction, **kwargs).

jmbarr commented 5 months ago

If you do that, it results in a second issue with EKF, as the code will now do self._transition_model(prediction).jacobian(prediction, **kwargs)

which gives you $F_{k\rightarrow k+1}|_{\mu_{k+1|k}}$ as opposed to $F_{k \rightarrow k+1}|_{\mu_{k|k}}$, I suppose. Is that the issue? In which case, why not modify _smooth_gain() to take subsq_state instead of prediction and get prediction from subsq_state once inside?
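A hedged sketch of that suggestion (illustrative only, not the contents of #945; shown for the EKF variant, where the Jacobian evaluation point matters):

```python
import numpy as np

def _smooth_gain(self, state, subsq_state, **kwargs):
    # Fetch the k -> k+1 prediction once, from the subsequent state.
    prediction = subsq_state.hypothesis.prediction
    # Model for the k -> k+1 step, linearised at the updated state at k
    # (mu_{k|k}) rather than at the prediction (mu_{k+1|k}).
    transition_matrix = self._transition_model(prediction).jacobian(state, **kwargs)
    # G_k = P_k F_k^T (P_{k+1|k})^{-1}
    return state.covar @ transition_matrix.T @ np.linalg.inv(prediction.covar)
```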

sdhiscocks commented 5 months ago

Yes, hence the need to make more changes for the EKF smoother than are apparent for just the KF smoother. _smooth_gain() needs prediction, and so does code outside that function call, so it makes sense to fetch it from subsq_state once and just pass prediction in.

@narykov Could you take a look/test #945 and confirm if it resolves this issue for you?