h2oai / h2o4gpu

H2Oai GPU Edition
Apache License 2.0

ARIMA n-step regression #797

Closed sh1ng closed 4 years ago

sh1ng commented 4 years ago

This is an implementation of the ARIMA model. It contains only the CUDA part, and a few more changes are upcoming, but I believe it's good enough to be reviewed. Once the Python part is done, I'm going to add a few more methods to estimate the coefficients, or a C++ version without CUDA.

Some theory can be found in:

sh1ng commented 4 years ago

Convergence is unstable, probably due to the residual approximation. I think it can be fixed by implementing an IIR filter (the residual computation can be expressed as an IIR filter).

The most suitable approach that I've found: https://userweb.cs.txstate.edu/~burtscher/papers/asplos18.pdf
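To make the IIR remark concrete, here is a minimal sequential sketch in plain Python. It is not the code from this PR and not the parallel scheme from the linked paper. The function names, the sign convention `y[t] = sum(phi*y_lags) + sum(theta*e_lags) + e[t]`, and the zero pre-sample values are all assumptions made for illustration.

```python
def simulate_arma(noise, phi, theta):
    """Generate an ARMA series from a known noise sequence.

    Assumed convention: y[t] = sum_i phi[i]*y[t-1-i] + sum_j theta[j]*e[t-1-j] + e[t],
    with everything before t = 0 treated as zero.
    """
    y = []
    for t, e_t in enumerate(noise):
        ar = sum(c * y[t - 1 - i] for i, c in enumerate(phi) if t - 1 - i >= 0)
        ma = sum(c * noise[t - 1 - j] for j, c in enumerate(theta) if t - 1 - j >= 0)
        y.append(ar + ma + e_t)
    return y


def arma_residuals(y, phi, theta):
    """Exact residuals via the recursion e[t] = y[t] - AR part - MA part.

    The MA part feeds earlier outputs e[t-1-j] back into the computation,
    which is what makes this an IIR filter rather than a plain convolution.
    """
    e = []
    for t, y_t in enumerate(y):
        ar = sum(c * y[t - 1 - i] for i, c in enumerate(phi) if t - 1 - i >= 0)
        ma = sum(c * e[t - 1 - j] for j, c in enumerate(theta) if t - 1 - j >= 0)
        e.append(y_t - ar - ma)
    return e
```

With the true coefficients, `arma_residuals(simulate_arma(noise, phi, theta), phi, theta)` recovers `noise` up to rounding. The loop over `t` is inherently sequential, which is why a GPU version needs a dedicated parallel recurrence algorithm.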

sh1ng commented 4 years ago

How it works:

  1. Regression to find AR coefficients
  2. Evaluate residuals (without the MA part)
  3. Regression to find AR and MA coefficients
  4. Correct residuals (using the AR coefficients, the MA coefficients, and the residuals from the previous step)
  5. Go to step 3.
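The steps above can be sketched on the CPU with NumPy roughly as follows. This is an illustration under assumptions, not the CUDA code in this PR: `fit_arma`, `lagged`, the zero padding of lags, the fixed iteration count, and the requirement `p >= 1, q >= 1` are all hypothetical choices.

```python
import numpy as np


def lagged(x, k):
    """Return x shifted by k >= 1 steps: out[t] = x[t-k], zero-padded at the start."""
    out = np.zeros_like(x)
    out[k:] = x[:len(x) - k]
    return out


def fit_arma(y, p, q, n_iter=10):
    """Iterative least-squares sketch of steps 1-5 (assumes p >= 1 and q >= 1)."""
    y = np.asarray(y, dtype=float)
    start = max(p, q)  # skip rows whose lags are only zero padding
    y_lags = np.column_stack([lagged(y, i) for i in range(1, p + 1)])

    # Step 1: regression to find the AR coefficients only.
    phi, *_ = np.linalg.lstsq(y_lags[start:], y[start:], rcond=None)
    # Step 2: residuals without the MA part.
    res = y - y_lags @ phi
    theta = np.zeros(q)

    for _ in range(n_iter):
        # Step 3: joint regression on lagged y and lagged residuals.
        e_lags = np.column_stack([lagged(res, j) for j in range(1, q + 1)])
        design = np.hstack([y_lags, e_lags])
        coef, *_ = np.linalg.lstsq(design[start:], y[start:], rcond=None)
        phi, theta = coef[:p], coef[p:]
        # Step 4: correct the residuals using the previous iteration's residuals.
        # This is the approximate part: the exact residual would depend on
        # the new residuals at t-1, ..., t-q, not the old ones.
        res = y - y_lags @ phi - e_lags @ theta
        # Step 5: loop back to step 3.
    return phi, theta, res
```

For example, `fit_arma(y, p=1, q=1)` returns the two coefficient arrays and the final residual vector. No convergence check is included, so the iteration count is simply fixed.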

As you can see, the computation in step 4 is not precise, but:

The precise implementation on GPU (mentioned above) is quite interesting, but it is inapplicable to this task because it requires knowing all coefficients in advance to compute the n-nacci numbers.
The classical algorithm (section 1.4 of https://www.cs.cmu.edu/~guyb/papers/Ble93.pdf) is too complicated.

I've also noticed that computing the residuals precisely makes convergence much more unstable (I implemented it on the CPU). I guess this happens because errors accumulate from iteration to iteration (when moving from time t-1 to t).
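One possible mechanism behind that guess can be shown with a toy MA(1) recursion. This is only an illustration, not a diagnosis of the behavior seen in this PR; `ma1_residuals`, `final_gap`, and the constant input series are hypothetical. In the exact recursion `e[t] = y[t] - theta*e[t-1]`, a perturbation `delta` in the starting residual is multiplied by `-theta` at every time step, so it decays when `|theta| < 1` and grows when an intermediate estimate has `|theta| > 1`.

```python
def ma1_residuals(y, theta, e_init):
    """Exact MA(1) residual recursion e[t] = y[t] - theta * e[t-1]."""
    e_prev = e_init
    out = []
    for y_t in y:
        e_prev = y_t - theta * e_prev
        out.append(e_prev)
    return out


def final_gap(theta, delta=1e-6, n=50):
    """Gap at the last time step between two runs whose start differs by delta."""
    y = [1.0] * n  # arbitrary input; the gap does not depend on y
    a = ma1_residuals(y, theta, 0.0)
    b = ma1_residuals(y, theta, delta)
    return abs(a[-1] - b[-1])
```

Here `final_gap(0.5)` stays far below `delta`, while `final_gap(1.1)` ends up well above it, since the gap scales like `abs(theta) ** n * delta`.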

It's good enough to be merged into master; other methods can be worked on in separate PRs.