intive-DataScience / tbats

BATS and TBATS forecasting methods
MIT License

Fastest configuration #23

Open mloning opened 3 years ago

mloning commented 3 years ago

Hi and thanks for working on this!

What's the fastest configuration for BATS and TBATS? We're running a number of basic tests and need to speed them up a bit.

We currently set "use_box_cox": False, "use_trend": False, "use_damped_trend": False, "sp": [1], "use_arma_errors": False. Anything we can do to make it faster? It's still an order of magnitude slower than AutoARIMA for example. Is that normal?

cotterpl commented 3 years ago

I think you can make it a little quicker with seasonal_periods=[], but then it will skip the code paths that fit seasonalities.

The shorter the time series you provide, the quicker it will run.

You can make it blazing fast if you provide a constant time series (for example: y=[1,1,1,...,1]), but I am not sure this is what you want: in that case it exits after just a few lines of code and returns a constant model.

The other, quite complex option (not something that is officially supported) is to provide a context to the TBATS constructor. The context is a simple implementation of dependency inversion. One of the slowest parts is ParamsOptimizer, which could be mocked or modified so that it returns the optimum found after just a few iterations.

Otherwise, without modifying the package code, it does not seem possible to make it quicker.

mloning commented 3 years ago

@cotterpl thanks for the reply! We're currently running BATS(seasonal_periods=[], use_arma_errors=False, use_box_cox=False, use_trend=False, use_damped_trend=False) and that still takes a few seconds.

In these tests we're not checking for numeric results but just want to make sure the API is followed properly (input/output types, etc). We're already using very small amounts of data.

What would the context object look like to make it as fast as possible?

aiwalter commented 3 years ago

Thanks @cotterpl. It would also be interesting to know which of the two, BATS or TBATS, is the faster way to run a model with tbats.

mloning commented 3 years ago

@aiwalter just tried it and BATS seems to be a bit faster (about 1.5 seconds for BATS vs 3 seconds for TBATS on average on my machine).

cotterpl commented 3 years ago

You may also try setting n_jobs=1. This turns off the spawning of parallel workers and may be quicker on short time series.

For the context: inherit from the default one and override the method that provides ParamsOptimizer. See ParamsOptimizer for the details that need to be overridden.
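As a rough illustration of the pattern described here, the sketch below shows a context that hands out an optimizer, and a subclass that swaps in one that stops after a few iterations. Every class and method name in it (Context, FastContext, Optimizer, QuickOptimizer, Estimator, create_optimizer) is a hypothetical stand-in and not the tbats API. The real method that provides ParamsOptimizer is not named in this thread and would need to be looked up in the package source.

```python
class Optimizer:
    """Hypothetical stand-in for an expensive parameter optimizer (not tbats code)."""

    max_iterations = 10_000

    def optimize(self, objective, x0):
        # Very simple 1-D search: step left or right, halve the step when stuck.
        x, step = x0, 1.0
        iterations = 0
        for iterations in range(1, self.max_iterations + 1):
            best = min((x - step, x + step), key=objective)
            if objective(best) < objective(x):
                x = best
            else:
                step /= 2
                if step < 1e-12:
                    break
        return x, iterations


class QuickOptimizer(Optimizer):
    """Same search, but gives up after a handful of iterations."""

    max_iterations = 3


class Context:
    """Hypothetical default context: the single place where collaborators are created."""

    def create_optimizer(self):
        return Optimizer()


class FastContext(Context):
    """Inherit from the default context and override only the optimizer factory."""

    def create_optimizer(self):
        return QuickOptimizer()


class Estimator:
    """Hypothetical estimator that asks its context for an optimizer."""

    def __init__(self, context=None):
        self.context = context if context is not None else Context()

    def fit(self, target):
        optimizer = self.context.create_optimizer()
        return optimizer.optimize(lambda x: (x - target) ** 2, x0=0.0)
```

With this layout, Estimator(context=FastContext()) runs the truncated search while Estimator() keeps the full one. The estimator code itself does not change, which is the point of passing a context to the constructor.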

mloning commented 3 years ago

@cotterpl sorry I can't really follow how to set up the context. Could you point me to the docs or paste a code snippet? That would be much appreciated!

mloning commented 3 years ago

Setting n_jobs=1 pushes down the run time to 100ms for BATS - so that may be enough already. Why isn't that the default?
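For reference, the settings discussed in this thread can be collected as in the sketch below. The helper names fast_bats_kwargs and time_bats_fit are made up for illustration. The keyword arguments are the ones quoted above, and the sketch assumes that BATS is importable from the tbats package and that fit(y) returns the fitted model. Timings will vary by machine and data.

```python
import time


def fast_bats_kwargs():
    """Keyword arguments from this thread that switch off the optional search paths."""
    return dict(
        seasonal_periods=[],     # skip the code paths that fit seasonalities
        use_arma_errors=False,
        use_box_cox=False,
        use_trend=False,
        use_damped_trend=False,
        n_jobs=1,                # no parallel workers; helps on short series
    )


def time_bats_fit(y):
    """Fit BATS on y and return (fitted_model, elapsed_seconds). Requires tbats."""
    # Imported lazily so that fast_bats_kwargs() is usable without tbats installed.
    from tbats import BATS

    start = time.perf_counter()
    model = BATS(**fast_bats_kwargs()).fit(y)
    return model, time.perf_counter() - start
```

Calling time_bats_fit(y) on a short, non-constant series gives a quick way to compare this configuration against the defaults on your own machine.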

cotterpl commented 3 years ago

It is not the default because time series are usually not short, and users usually do not turn all of the parameters off. In that situation multiprocessing is quicker.