Closed rspadim closed 4 years ago
Hi! Thank you for your interest in Simdkalman
At what point did you get stuck when you tried to migrate this code? The example code seems to contain all elements used in the above snippet
Also, what is your primary motivation for porting this code to Simdkalman? Is PyKalman too slow, or would you like to analyse multiple similar-sized time series in a vectorized manner?
As a clarification to my comment in the PyKalman issue tracker https://github.com/pykalman/pykalman/issues/86: I do not think it is necessary for the users of PyKalman to change their Kalman filter library if PyKalman is working fine. However, I do think that library developers should avoid wasting their time submitting PRs or issues to PyKalman as the maintainer is not reacting to those anymore.
However, I agree that a short migration guide would be a good addition to the READMEs
Hi @oseiskar! The PyKalman code isn't very fast, but "fast" without a benchmark isn't a good metric (that's the motivation).
I got stuck at the em() function; I haven't started the implementation yet.
Thanks for the example code, I will try to use it now.
Sorry, I missed the questions in your original message (at a quick glance it seemed that the code snippet was just copied as-is from the paper, which it wasn't). They were not trivial either, so let me try to answer:
my doubt... should the H be = [[h,0]] ? considering https://github.com/oseiskar/simdkalman/issues/11
The correct value for H seems to be np.eye(1, 3), which is [[1,0,0]]. This is a hidden and undocumented feature in PyKalman (and not an obvious default in my opinion). It's always better to define the observation matrix explicitly, if not for other purposes, then for readability / documentation.
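To make the shape of H concrete, here is a small NumPy check. It is only a sketch: the state vector values are made up for illustration, and nothing here calls PyKalman or Simdkalman.

```python
import numpy as np

# Explicit observation matrix: one observation (position) taken from a
# 3-dimensional state (position, velocity, acceleration).
H = np.array([[1.0, 0.0, 0.0]])

# np.eye(rows, cols) with rows = number of observations and
# cols = number of states builds the same 1x3 matrix.
H_from_eye = np.eye(1, 3)

# Hypothetical state vector, used only for illustration.
x = np.array([2.0, 0.5, -0.1])

# y = H x picks out the position component of the state.
y = H @ x

print(H.shape, np.array_equal(H, H_from_eye), y)
```

Writing H out explicitly, as in the first assignment, keeps the number of observations and states visible to the reader.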
kf=kf.em(z) # could this be done by simdkalman?
Yes, Simdkalman actually has exactly the same API (not documented currently, unfortunately): https://github.com/oseiskar/simdkalman/blob/67e538a5981aa4a2dc323028167913bf5f676383/examples/example.py#L27
which does the same thing, that is, apply the EM algorithm on these variables: https://github.com/pykalman/pykalman/blob/8d3f8e498b64d902016a0216bf2bcc8b262d917b/pykalman/standard.py#L1495-L1498 (these defaults are also undocumented in PyKalman)
x_mean,x_covar=kf.smooth(z) # what about this one?
Yes, this is also done in the example https://github.com/oseiskar/simdkalman/blob/67e538a5981aa4a2dc323028167913bf5f676383/examples/example.py#L30 . You can access the mean as smoothed.mean and the covariance as smoothed.cov.
The forecasting feature (step 5 in your code) can be done using the "predict" method: https://github.com/oseiskar/simdkalman/blob/67e538a5981aa4a2dc323028167913bf5f676383/examples/example.py#L32
Looks promising: from 33 seconds to 11 seconds (3x).
I'm not sure yet about the covar output and the multi-variable output (pykalman gives space/velocity/acceleration output variables; with simdkalman I'm not sure if we have them).
first translation (not tested covar/std/predict yet)
# by MLdP on 02/22/2014 <lopezdeprado@lbl.gov>
# Kinetic Component Analysis
import numpy as np
from pykalman import KalmanFilter
import simdkalman
#-------------------------------------------------------------------------------
def fitKCA(t,z,q,fwd=0):
    '''
    Inputs:
        t: Iterable with time indices
        z: Iterable with measurements
        q: Scalar that multiplies the seed states covariance
        fwd: number of steps to forecast (optional, default=0)
    Output:
        x[0]: smoothed state means of position velocity and acceleration
        x[1]: smoothed state covar of position velocity and acceleration
    Dependencies: numpy, pykalman
    '''
    #1) Set up matrices A,H and a seed for Q
    h=(t[-1]-t[0])/t.shape[0]
    A=np.array([[1,h,.5*h**2],
                [0,1,h],
                [0,0,1]])
    Q=q*np.eye(A.shape[0])
    #2) Apply the filter
    kf=KalmanFilter(transition_matrices=A,transition_covariance=Q)
    #3) EM estimates
    kf=kf.em(z)
    #4) Smooth
    x_mean,x_covar=kf.smooth(z)
    #5) Forecast
    for fwd_ in range(fwd):
        x_mean_,x_covar_=kf.filter_update(filtered_state_mean=x_mean[-1], \
            filtered_state_covariance=x_covar[-1])
        x_mean=np.append(x_mean,x_mean_.reshape(1,-1),axis=0)
        x_covar_=np.expand_dims(x_covar_,axis=0)
        x_covar=np.append(x_covar,x_covar_,axis=0)
    #6) Std series
    x_std=(x_covar[:,0,0]**.5).reshape(-1,1)
    for i in range(1,x_covar.shape[1]):
        x_std_=x_covar[:,i,i]**.5
        x_std=np.append(x_std,x_std_.reshape(-1,1),axis=1)
    print(x_std)
    return x_mean,x_std,x_covar
# TODO: change to simdkalman
# https://github.com/oseiskar/simdkalman/blob/master/examples/example.py
def fitKCA_simdkalman(t,z,q,fwd=0):
    '''
    Inputs:
        t: Iterable with time indices
        z: Iterable with measurements
        q: Scalar that multiplies the seed states covariance
        fwd: number of steps to forecast (optional, default=0)
    Output:
        x[0]: smoothed state means of position velocity and acceleration
        x[1]: smoothed state covar of position velocity and acceleration
    Dependencies: numpy, simdkalman
    '''
    #1) Set up matrices A,H and a seed for Q
    h=(t[-1]-t[0])/t.shape[0]
    A=np.array([[1,h,.5*h**2],
                [0,1,h],
                [0,0,1]])
    Q=q*np.eye(A.shape[0])
    #2) Apply the filter
    #kf=KalmanFilter(transition_matrices=A,transition_covariance=Q)
    kf = simdkalman.KalmanFilter(
        state_transition = A, # <--- this is the matrix A
        process_noise = Q, # Q
        observation_model = np.array([[1,0,0]]), # H
        observation_noise = q
    )
    #3) EM estimates
    kf=kf.em(z)
    #4) Smooth
    smoothed = kf.smooth(z)
    x_mean, x_covar = smoothed.observations.mean, smoothed.observations.cov
    x_std = None
    #5) Forecast (disabled, not tested yet)
    if False:
        pred = kf.predict(z, fwd)
        print(pred)
        x_mean = np.append(x_mean, pred.mean)
        x_covar = np.append(x_covar, pred.cov)
    #for fwd_ in range(fwd):
    #    x_mean_,x_covar_=kf.filter_update(filtered_state_mean=x_mean[-1], \
    #        filtered_state_covariance=x_covar[-1])
    #    x_mean=np.append(x_mean,x_mean_.reshape(1,-1),axis=0)
    #    x_covar_=np.expand_dims(x_covar_,axis=0)
    #    x_covar=np.append(x_covar,x_covar_,axis=0)
    #6) Std series (disabled, not tested yet)
    if False:
        x_std=(x_covar**.5).reshape(-1,1)
        #for i in range(1,x_covar.shape[1]):
        #    x_std_=x_covar[:,i,i]**.5
        #    x_std=np.append(x_std,x_std_.reshape(-1,1),axis=1)
        print(x_std)
    return x_mean,x_std,x_covar
From pykalman I got 3 outputs, but simdkalman only outputs one variable (space):
plt.plot(ret_pykalman[0][500:][:,0]) # pykalman - check [:,0] return space, [:,1] return velocity, [:,2] return acceleration
plt.show()
plt.plot(ret_simdkalman[0][500:]) # simdkalman, only output space
plt.show()
The hidden state variables are available too: just access the states field instead of observations, e.g., smoothed.observations.mean -> smoothed.states.mean (see also the docs).
Uhm, I’m a bit confused.
Using pykalman, the system is a kinetic equation (space = velocity * time + .5 * acceleration * time ** 2) that returns 3 measures (space, velocity, acceleration). I can access them with [:,0], [:,1], [:,2] in pykalman, but with simdkalman I’m trying to understand how to do something like that.
Any help is welcome; the space variable is working.
If I understood it right, pykalman runs the system and creates a multi-time-series output, while simdkalman has a different output (maybe I should input velocity and acceleration as time series?)
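As a side note on the kinetic model in question, the transition matrix A from the snippet advances a (position, velocity, acceleration) state by one time step. The following NumPy sketch shows one such step; the step size and the state values are made up for illustration.

```python
import numpy as np

h = 0.1  # hypothetical time step

# Same structure as the matrix A in the snippet above:
# position += velocity*h + 0.5*acceleration*h**2, velocity += acceleration*h
A = np.array([[1.0, h, 0.5 * h**2],
              [0.0, 1.0, h],
              [0.0, 0.0, 1.0]])

# Hypothetical state: position 0, velocity 1, acceleration 2.
x = np.array([0.0, 1.0, 2.0])

# One step of the model (without noise): x_next = A x
x_next = A @ x

print(x_next)
```

All three components live in the state vector, so velocity and acceleration do not need to be supplied as separate input series.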
See my previous message: Simdkalman outputs both:
- the hidden states (x) as smoothed.states.mean
- the smoothed observations y = H x as smoothed.observations.mean
You now use smoothed.observations. Just change to smoothed.states if you want the 3-dimensional hidden states. The covariances for both things are also available (e.g. smoothed.observations.cov vs smoothed.states.cov).
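The relation between the two outputs can be illustrated with plain NumPy. This is a sketch with made-up state means and covariances; it only shows the projection through H and does not claim to reproduce exactly what the library stores in observations.cov (for instance, whether observation noise is added on top).

```python
import numpy as np

H = np.array([[1.0, 0.0, 0.0]])  # observation matrix: observe position only

# Made-up smoothed state means for T = 4 time steps, shape (T, 3):
# columns are position, velocity, acceleration.
states_mean = np.array([[0.0, 1.0, 0.2],
                        [1.0, 1.2, 0.2],
                        [2.3, 1.4, 0.2],
                        [3.8, 1.6, 0.2]])

# Made-up state covariances, shape (T, 3, 3).
states_cov = np.tile(np.diag([0.5, 0.1, 0.01]), (4, 1, 1))

# y = H x for every time step, shape (T, 1).
obs_mean = states_mean @ H.T

# Projection of the state covariance, H P H^T, shape (T, 1, 1).
obs_cov = H @ states_cov @ H.T

print(obs_mean[:, 0])
print(obs_cov[:, 0, 0])
```

With this H, the observation mean is just the first column of the state mean, which is why only the "space" series showed up in the plot.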
Ok, I think it’s just part of “porting pykalman behaviour”.
I see there’s 1 observation here, while in pykalman there are 3 observations. Maybe a good point to explain is how to get back to states and observations and recreate these “other observations”.
Technically, PyKalman always outputs the hidden states (x) in smoothing. The smoothed observations (produced from the states by multiplying with the observation matrix: y = Hx) are a convenience feature in Simdkalman only. If you just always use smoothed.states.mean and smoothed.states.cov (and forget about the smoothed observations), the behavior is identical to PyKalman.
Observations and (hidden) states are two very different objects in Kalman filters and it's always good to be explicit about which ones you deal with ... and always explicitly define the matrix H, which transforms hidden states to observations.
Nice @oseiskar !
I think this is something that a programmer (not a Kalman user) will ask about; maybe an example would be an interesting addition to the documentation.
Just to report what the difference between the pykalman and simdkalman outputs is, and what to do to get the same behavior.
The main issue seems to be resolved
Hello!
I want to understand how to port pykalman code to simdkalman, could anyone help me?
It's an implementation of a Kalman filter for understanding kinetic (position+velocity+acceleration) movement (KCA) from Lopez de Prado (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2422183)