winedarksea / AutoTS

Automated Time Series Forecasting
MIT License
1.05k stars 96 forks source link
automl autots deep-learning feature-engineering forecasting machine-learning preprocessing time-series

AutoTS

AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale.

In 2023, AutoTS won in the M6 forecasting competition, delivering the highest performance investment decisions across 12 months of stock market forecasting.

There are dozens of forecasting models usable in the sklearn style of .fit() and .predict(). These includes naive, statistical, machine learning, and deep learning models. Additionally, there are over 30 time series specific transforms usable in the sklearn style of .fit(), .transform() and .inverse_transform(). All of these function directly on Pandas Dataframes, without the need for conversion to proprietary objects.

All models support forecasting multivariate (multiple time series) outputs and also support probabilistic (upper/lower bound) forecasts. Most models can readily scale to tens and even hundreds of thousands of input series. Many models also support passing in user-defined exogenous regressors.

These models are all designed for integration in an AutoML feature search which automatically finds the best models, preprocessing, and ensembling for a given dataset through genetic algorithms.

Horizontal and mosaic style ensembles are the flagship ensembling types, allowing each series to receive the most accurate possible models while still maintaining scalability.

A combination of metrics and cross-validation options, the ability to apply subsets and weighting, regressor generation tools, simulation forecasting mode, event risk forecasting, live datasets, template import and export, plotting, and a collection of data shaping parameters round out the available feature set.

Table of Contents

Installation

pip install autots

This includes dependencies for basic models, but additonal packages are required for some models and methods.

Be advised there are several other projects that have chosen similar names, so make sure you are on the right AutoTS code, papers, and documentation.

Basic Use

Input data for AutoTS is expected to come in either a long or a wide format:

Lower-level functions are only designed for wide style data.

# also load: _hourly, _monthly, _weekly, _yearly, or _live_daily
from autots import AutoTS, load_daily

# sample datasets can be used in either of the long or wide import shapes
long = False
df = load_daily(long=long)

model = AutoTS(
    forecast_length=21,
    frequency='infer',
    prediction_interval=0.9,
    ensemble='auto',
    model_list="fast",  # "superfast", "default", "fast_parallel"
    transformer_list="fast",  # "superfast",
    drop_most_recent=1,
    max_generations=4,
    num_validations=2,
    validation_method="backwards"
)
model = model.fit(
    df,
    date_col='datetime' if long else None,
    value_col='value' if long else None,
    id_col='series_id' if long else None,
)

prediction = model.predict()
# plot a sample
prediction.plot(model.df_wide_numeric,
                series=model.df_wide_numeric.columns[0],
                start_date="2019-01-01")
# Print the details of the best model
print(model)

# point forecasts dataframe
forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
# and aggregated from cross validation
validation_results = model.results("validation")

The lower-level API, in particular the large section of time series transformers in the scikit-learn style, can also be utilized independently from the AutoML framework.

Check out extended_tutorial.md for a more detailed guide to features.

Also take a look at the production_example.py

Tips for Speed and Large Data:

How to Contribute:

Also known as Project CATS (Catlin's Automated Time Series) hence the logo.