rlabbe / Kalman-and-Bayesian-Filters-in-Python

Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters, extended Kalman filters, unscented Kalman filters, particle filters, and more. All exercises include solutions.

Chapter 7: Piecewise White Noise Model for first order system #394

Open JeanLuc001 opened 3 years ago

JeanLuc001 commented 3 years ago

In chapter 7, section "Piecewise White Noise Model", it is stated that the "highest order term (say, acceleration) is constant for the duration of each time period, but differs for each time period". However, in the derivation of the Q matrix for the first order model, the noise is introduced as a random underlying acceleration rather than a random velocity, even though velocity, not acceleration, is the highest order term in that filter.

Making the velocity random instead of the acceleration would be in agreement with the ConstantVelocityObject class in chapter 8, where the velocity, not the acceleration, is modeled as a noisy variable.

So, instead of

[image: the equation from the book, with w entering as an acceleration]

shouldn't it be

[image: the proposed alternative equation]

and w being a velocity?
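
Since the rendered equations above did not come through, here is a minimal sympy sketch of the two options as I read them from the text (the Gamma vectors are my assumption, not copied from the images): with w treated as an acceleration, Gamma = [dt^2/2, dt]^T; with w treated as a velocity, Gamma = [dt, 1]^T. In both cases Q = Gamma * sigma^2 * Gamma^T:

```python
# Sketch of the two readings of the question, assuming the Gamma vectors above.
import sympy as sp

dt = sp.Symbol('dt', positive=True)
sigma2 = sp.Symbol('sigma2', positive=True)  # variance of the scalar noise w

gamma_accel = sp.Matrix([dt**2 / 2, dt])  # book's reading: w is an acceleration
gamma_vel = sp.Matrix([dt, 1])            # proposed reading: w is a velocity

Q_accel = sigma2 * gamma_accel * gamma_accel.T
Q_vel = sigma2 * gamma_vel * gamma_vel.T

sp.pprint(Q_accel)  # sigma2 * [[dt**4/4, dt**3/2], [dt**3/2, dt**2]]
sp.pprint(Q_vel)    # sigma2 * [[dt**2,   dt     ], [dt,      1    ]]
```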

PebetoUofC commented 2 years ago

With a first-order system, it would seem more natural to assume noise in the velocity, as you suggested (with w being a velocity). However, looking at the model, I do not see mathematically why we could not model the noise with w as an acceleration. Note that in this case the units remain consistent, since acceleration * time = velocity.
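
A trivial check of that dimensional argument (a sketch using sympy.physics.units; the numbers are arbitrary):

```python
# An acceleration multiplied by a time step has the units of a velocity.
from sympy.physics import units as u

dt = 0.1 * u.second                 # time step
w = 2.0 * u.meter / u.second**2     # w interpreted as an acceleration
dv = w * dt                         # contribution of w to the velocity
print(u.convert_to(dv, u.meter / u.second))  # 0.2*meter/second
```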

heyicheng-never commented 1 year ago

@PebetoUofC Suppose we use a model for the noise with w as acceleration. But why is the covariance of the process noise related to the variance of v and not to a? image

raphaelreme commented 5 months ago

It is indeed odd: filterpy (and the book) use a different hypothesis for the constant acceleration and constant jerk models than for the constant velocity model.

TL;DR: The hypotheses differ, and this should not be the case. If I have some time, I'll try to submit a PR to fix this in filterpy (and maybe for the book too?).

  1. For the constant velocity hypothesis: it assumes that the acceleration is zero up to some noise over the discretization interval: $\forall t \in [t_k, t_{k+1}], a(t) = a_k$ where $a_k \sim \mathcal{N}(0, \sigma^2)$. By integration, we have: $\forall h \in [0, dt], v(t_k + h) = v_k + h a_k$ and $x(t_k + h) = x_k + h v_k + \frac{h^2}{2} a_k$.

By computing the expectations, variances and covariances (fixing all random variables except $a_k$) at $h = dt$ we indeed find:

$$
F = \begin{pmatrix}1 & dt\\ 0 & 1\end{pmatrix} \text{ and } Q = \sigma^2\begin{pmatrix}\frac{dt^4}{4} & \frac{dt^3}{2}\\ \frac{dt^3}{2} & dt^2\end{pmatrix}
$$
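
This is also what filterpy's Q_discrete_white_noise returns for the two-state case; a quick numerical check (arbitrary dt and var) that it matches the Q above:

```python
# Check that Q_discrete_white_noise(dim=2) reproduces sigma^2 * [[dt^4/4, dt^3/2], [dt^3/2, dt^2]].
import numpy as np
from filterpy.common import Q_discrete_white_noise

dt, var = 0.5, 2.0
Q_filterpy = Q_discrete_white_noise(dim=2, dt=dt, var=var)
Q_hyp1 = var * np.array([[dt**4 / 4, dt**3 / 2],
                         [dt**3 / 2, dt**2]])
print(np.allclose(Q_filterpy, Q_hyp1))  # True
```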

If we followed the same logic for the constant acceleration model, we would assume that the jerk is zero up to some noise over the discretization interval: $\forall t \in [t_k, t_{k+1}], j(t) = j_k$ where $j_k \sim \mathcal{N}(0, \sigma^2)$. By integration, we have: $\forall h \in [0, dt], a(t_k + h) = a_k + h j_k$, $v(t_k + h) = v_k + h a_k + \frac{h^2}{2} j_k$ and $x(t_k + h) = x_k + h v_k + \frac{h^2}{2} a_k + \frac{h^3}{6} j_k$.

Again by computing the expectations, variances and covariances (fixing all random variables except $j_k$) at $h = dt$ we find:

$$
F = \begin{pmatrix}1 & dt & \frac{dt^2}{2}\\ 0 & 1 & dt\\ 0 & 0 & 1\end{pmatrix} \text{ and } Q = \sigma^2\begin{pmatrix}\frac{dt^6}{36} & \frac{dt^5}{12} & \frac{dt^4}{6}\\ \frac{dt^5}{12} & \frac{dt^4}{4} & \frac{dt^3}{2}\\ \frac{dt^4}{6} & \frac{dt^3}{2} & dt^2\end{pmatrix}
$$

This is not what is done in filterpy or in the book for the constant acceleration/jerk models.
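
To make the difference concrete, here is a small numerical comparison (arbitrary dt and var) between the white-jerk Q derived above and what Q_discrete_white_noise(dim=3) actually returns:

```python
# The white-jerk Q via Gamma * Gamma^T with Gamma = [dt^3/6, dt^2/2, dt]^T,
# compared against filterpy's dim=3 output. The two do not match.
import numpy as np
from filterpy.common import Q_discrete_white_noise

dt, var = 0.5, 1.0
gamma = np.array([[dt**3 / 6], [dt**2 / 2], [dt]])
Q_white_jerk = var * gamma @ gamma.T          # the matrix derived above
Q_filterpy = Q_discrete_white_noise(dim=3, dt=dt, var=var)

print(np.round(Q_white_jerk, 5))
print(np.round(Q_filterpy, 5))
print(np.allclose(Q_white_jerk, Q_filterpy))  # False: the hypotheses differ
```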

  2. For the constant acceleration/jerk, it rather assumes that the acceleration is constant on the time interval, with a slight variation from the previous time interval: $\forall t \in [t_k, t_{k+1}], a(t) = a_k + w_k$ where $w_k \sim \mathcal{N}(0, \sigma^2)$. By integration, we have: $\forall h \in [0, dt], v(t_k + h) = v_k + h a_k + h w_k$ and $x(t_k + h) = x_k + h v_k + \frac{h^2}{2} a_k + \frac{h^2}{2} w_k$.

By computing the expectations, variances and covariances (fixing all random variables except $w_k$) at $h = dt$ we indeed find:

$$
F = \begin{pmatrix}1 & dt & \frac{dt^2}{2}\\ 0 & 1 & dt\\ 0 & 0 & 1\end{pmatrix} \text{ and } Q = \sigma^2\begin{pmatrix}\frac{dt^4}{4} & \frac{dt^3}{2} & \frac{dt^2}{2}\\ \frac{dt^3}{2} & dt^2 & dt\\ \frac{dt^2}{2} & dt & 1\end{pmatrix}
$$

If we applied the same logic for the constant velocity model, we would assume that the velocity is constant on the time interval, with a slight variation from the previous one: $\forall t \in [t_k, t_{k+1}], v(t) = v_k + w_k$ where $w_k \sim \mathcal{N}(0, \sigma^2)$. By integration, we have: $\forall h \in [0, dt], x(t_k + h) = x_k + h v_k + h w_k$.

Again by computing the expectations, variances and covariances (fixing all random variables except $w_k$) at $h = dt$ we find:

$$
F = \begin{pmatrix}1 & dt\\ 0 & 1\end{pmatrix} \text{ and } Q = \sigma^2\begin{pmatrix}dt^2 & dt\\ dt & 1\end{pmatrix}
$$

And this differs from the first hypothesis, which is the one actually used for the constant velocity model.
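
A compact numerical illustration of that mismatch (arbitrary dt and var), checking the two hypothesis-2 matrices above against filterpy's output:

```python
# filterpy's dim=3 Q matches hypothesis 2 (Gamma = [dt^2/2, dt, 1]^T), while its
# dim=2 Q does NOT match hypothesis 2 (Gamma = [dt, 1]^T); it follows hypothesis 1.
import numpy as np
from filterpy.common import Q_discrete_white_noise

dt, var = 0.5, 1.0
gamma_ca = np.array([[dt**2 / 2], [dt], [1.0]])  # hypothesis 2, constant acceleration
gamma_cv = np.array([[dt], [1.0]])               # hypothesis 2, constant velocity

print(np.allclose(Q_discrete_white_noise(dim=3, dt=dt, var=var),
                  var * gamma_ca @ gamma_ca.T))  # True
print(np.allclose(Q_discrete_white_noise(dim=2, dt=dt, var=var),
                  var * gamma_cv @ gamma_cv.T))  # False
```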

From my point of view, both are valid models. The second one is a bit simpler and truly assumes a constant velocity/acceleration. The first one models a zero-mean acceleration (resp. jerk) for the constant velocity (resp. acceleration) model, and therefore the velocity (resp. acceleration) is not truly assumed constant (only in expectation). I'm not sure which is best, but filterpy should probably be consistent and choose one (or let the user choose by having both implemented).
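
If it helps, here is a rough, purely hypothetical sketch of what "having both implemented" could look like. The helper name and the model keyword are invented for illustration and do not exist in filterpy; only the noise-gain vector Gamma changes between the two hypotheses, with Q = Gamma * var * Gamma^T in both cases:

```python
# Hypothetical helper exposing both conventions side by side (not a filterpy API).
import numpy as np
from math import factorial

def q_piecewise(dim, dt=1.0, var=1.0, model="white_noise"):
    """Process noise Q for a dim-state kinematic model (states ordered x, v, a, ...).

    model="white_noise":  hypothesis 1 -- the derivative just above the highest
        state is zero-mean white noise on each interval.
    model="random_step":  hypothesis 2 -- the highest state is constant on each
        interval, with a random step from the previous interval.
    """
    if model == "white_noise":
        gamma = np.array([dt ** (dim - i) / factorial(dim - i) for i in range(dim)])
    elif model == "random_step":
        gamma = np.array([dt ** (dim - 1 - i) / factorial(dim - 1 - i) for i in range(dim)])
    else:
        raise ValueError("unknown model: " + model)
    return var * np.outer(gamma, gamma)

# q_piecewise(2, dt, var, "white_noise") reproduces hypothesis 1's constant velocity Q,
# q_piecewise(3, dt, var, "random_step") reproduces hypothesis 2's constant acceleration Q.
```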