feature request: state-space models

koaning / scikit-lego

Extra blocks for scikit-learn pipelines.

https://koaning.github.io/scikit-lego/

MIT License


feature request: state-space models #13

Closed dvanamstel closed 5 years ago

dvanamstel commented 5 years ago

Add state-space models, in discrete form:

x(k+1) = A * x(k) + B * u(k)
y(k) = C * x(k) + D * u(k)

where:

- x(k): internal state vector at time step k
- u(k): input vector at time step k
- y(k): output at time step k
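To make the two equations concrete, here is a minimal sketch of how such a model could be simulated once A, B, C and D are known. It is not a proposed scikit-lego API: the names `matvec`, `vec_add` and `simulate` are made up for illustration, the one-state system at the bottom is a hypothetical example, and plain Python lists are used only to keep the sketch dependency-free.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def vec_add(a, b):
    """Element-wise sum of two vectors of equal length."""
    return [a_i + b_i for a_i, b_i in zip(a, b)]


def simulate(A, B, C, D, x0, inputs):
    """Return the outputs y(0), y(1), ... for a sequence of input vectors."""
    x = list(x0)
    outputs = []
    for u in inputs:
        # y(k) = C * x(k) + D * u(k)
        outputs.append(vec_add(matvec(C, x), matvec(D, u)))
        # x(k+1) = A * x(k) + B * u(k)
        x = vec_add(matvec(A, x), matvec(B, u))
    return outputs


# Hypothetical one-state system: x(k+1) = 0.5 * x(k) + u(k), y(k) = x(k)
A, B, C, D = [[0.5]], [[1.0]], [[1.0]], [[0.0]]
ys = simulate(A, B, C, D, x0=[0.0], inputs=[[1.0], [0.0], [0.0], [0.0]])
print(ys)  # response to a single unit input at k=0
```

The response decays by a factor of 0.5 per step, which is the kind of time-domain behaviour a static linear model cannot represent. Fitting A, B, C and D from data is the hard part and is not covered by this sketch.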

Initial implementation would be with a given size of state vector x (e.g. you know the dimension of the underlying system). Second iteration could also estimate the length of this vector x, but that's prob not doable in a single day.

Must admit: I haven't seen many use-cases that would be best solved using a state-space model and thus wonder how useful this can be. Also, I haven't seen many use-cases in general.

koaning commented 5 years ago

A few observations.

1. This assumes that the dataset X, y is sorted and represents a single time series. Is this the intended behavior?
2. What is reasonable for k=0?
3. How many steps ahead might be reasonable to predict?
4. Can you come up with a use-case?

dvanamstel commented 5 years ago

1. Indeed, X and y have to be sorted time series.
2. Not sure I understand the question correctly. For k=0 you insert the first element of your time series, i.e. the "initial condition".
3. In theory: infinite. In practice: it depends on how good the fit is.
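The point about the prediction horizon can be illustrated with a small sketch. It assumes a scalar system without inputs, x(k+1) = a * x(k), started from a known initial condition x(0); the values of `a_true` and `a_fit` are hypothetical and only stand in for "the real system" and "a slightly wrong fit".

```python
def free_run(a, x0, steps):
    """Iterate x(k+1) = a * x(k) with no input, starting from x(0) = x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a * xs[-1])
    return xs


a_true, a_fit = 0.9, 0.8  # hypothetical true and estimated dynamics
truth = free_run(a_true, x0=1.0, steps=10)
forecast = free_run(a_fit, x0=1.0, steps=10)

# Relative forecast error at each horizon k = 0, 1, ..., 10
rel_err = [abs(f - t) / abs(t) for f, t in zip(forecast, truth)]
for k, err in enumerate(rel_err):
    print(f"k={k:2d}  relative error={err:.3f}")
```

In this toy setting the relative error is zero at k=0 (the initial condition is given) and grows with every step, so the usable horizon is set by how close the fitted dynamics are to the real ones.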

Compared to a linear model, it should perform better because it learns time-domain dynamics. (Note that if A, B and C are 0, the formulation reduces to a "standard" linear model y = D * u.) I wouldn't be surprised if you could achieve the same result by adding delayed features... but that is manual work ;).
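Both remarks can be checked numerically. The sketch below assumes a scalar system with a zero initial state, in which case unrolling the recursion gives y(k) = d * u(k) + sum over i = 1..k of c * a^(i-1) * b * u(k-i), i.e. a linear model on delayed inputs with fixed weights. The function names and the parameter values are made up for illustration.

```python
def simulate_scalar(a, b, c, d, inputs, x0=0.0):
    """Scalar state-space model: x(k+1) = a*x(k) + b*u(k), y(k) = c*x(k) + d*u(k)."""
    x, ys = x0, []
    for u in inputs:
        ys.append(c * x + d * u)
        x = a * x + b * u
    return ys


def lagged_sum(a, b, c, d, inputs):
    """Same outputs written as a weighted sum of the current and delayed inputs."""
    ys = []
    for k, u in enumerate(inputs):
        y = d * u
        for i in range(1, k + 1):
            # weight of the input delayed by i steps is c * a**(i-1) * b
            y += c * a ** (i - 1) * b * inputs[k - i]
        ys.append(y)
    return ys


inputs = [1.0, 2.0, 0.0, -1.0, 3.0]  # arbitrary example input sequence
ss = simulate_scalar(0.5, 1.0, 2.0, 0.1, inputs)
lag = lagged_sum(0.5, 1.0, 2.0, 0.1, inputs)
static = simulate_scalar(0.0, 0.0, 0.0, 0.1, inputs)  # A = B = C = 0

print(ss)
print(lag)
print(static)
```

The first two lists agree up to floating-point rounding, and the third is just 0.1 * u(k). The difference in practice is that the state-space form needs only four parameters here, whereas a delayed-feature regression needs one weight per lag and a choice of how many lags to include.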

Use cases:

- Marketing efforts to sales predictions
- Production line (or any system that has "B can start after A is finished" types of rules) disturbance analysis

koaning commented 5 years ago

in a way, it sounds as if you are going to implement an RNN.

koaning commented 5 years ago

i might be open to something like this being a part of scikit-lego via a dependency. but i kind of want to be careful with how many dependencies we add to the project.

thoughts, @MBrouns ?

dvanamstel commented 5 years ago

I wasn't planning on implementing an RNN. That would be equivalent to using a neural network to implement a linear model: possible, but a bit overkill. :)

koaning commented 5 years ago

something tells me that this will be really hard to implement if you consider the train/test aspect of it. i'll close this issue for now (but feel free to challenge me on this)