antoinecarme / pyaf

PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.
BSD 3-Clause "New" or "Revised" License

Pyaf 5.0 Final Touch 5 : Add more info about Exogenous Data Used in ARX Models #236

Closed by antoinecarme 1 year ago

antoinecarme commented 1 year ago

The logs need more detail about how exogenous data is used in ARX models. The same applies to XGBX, LGBX, LSTMX, MLPX, etc.

This is just a reporting task; there is no need to change the underlying models. The logs should report:

Used categorical variables: the categories used/excluded, their lags, and their coefficients.

Used continuous/numerical variables: their means, stddevs, etc., their lags, and their coefficients.

The list of excluded variables/categories.

Something like:

INFO:pyaf.std:AR_MODEL_DETAIL_START
INFO:pyaf.std:AR_MODEL_COEFF 1 Exog3=AQ_Lag48 -5351895601474.041
INFO:pyaf.std:AR_MODEL_COEFF 2 Exog2_Lag37 1671985441354.5808
INFO:pyaf.std:AR_MODEL_COEFF 3 Exog2_Lag48 -1671985441354.5703
INFO:pyaf.std:AR_MODEL_COEFF 4 Exog3=AQ_Lag37 486536018129.6282
INFO:pyaf.std:AR_MODEL_COEFF 5 Exog3=AQ_Lag45 486536018129.53394
INFO:pyaf.std:AR_MODEL_COEFF 6 Exog3=AQ_Lag47 486536018129.2731
INFO:pyaf.std:AR_MODEL_COEFF 7 Exog3=AQ_Lag39 486536018129.2544
INFO:pyaf.std:AR_MODEL_COEFF 8 Exog3=AQ_Lag41 486535968300.5181
INFO:pyaf.std:AR_MODEL_COEFF 9 Exog3=AQ_Lag43 486535968300.41895
INFO:pyaf.std:AR_MODEL_COEFF 10 Exog3=AQ_Lag42 486535968300.296
INFO:pyaf.std:AR_MODEL_DETAIL_END
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_START used=['Exog4', 'Exog2', 'Exog3']
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_CATEGORICAL 'Exog3' ['AQ', 'AR', 'AS', 'AT', 'AU'] ['Exog3=AQ']
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_CATEGORICAL 'Exog4' ['P_T', 'P_R', 'P_U', 'P_S', 'P_Q'] ['Exog4=P_R']
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_CATEGORICAL_EXCLUDED 0 []
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_CONTINUOUS 'Exog2' {'Mean': 6.411764705882353, 'StdDev': 3.4365094970361736}
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_CONTINUOUS_EXCLUDED 0 []
INFO:pyaf.std:EXOGENOUS_VARIABLE_DETAIL_END
INFO:pyaf.std:TRAINING_TIME_IN_SECONDS 14.89
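
As an illustration only, the EXOGENOUS_VARIABLE_DETAIL_* lines above could be produced by a small formatting helper like the one below. This is a sketch, not PyAF's actual implementation: the function name format_exogenous_details and the shapes of its arguments are hypothetical, and the third field of each CATEGORICAL line is assumed to be the list of encoded categories actually used by the model. Only the standard logging module is used, and the values are copied from the sample log.

```python
import logging


def format_exogenous_details(used, categorical, continuous,
                             excluded_categorical=(), excluded_continuous=()):
    """Build EXOGENOUS_VARIABLE_DETAIL_* messages in the layout sampled above.

    categorical: {variable: (all_categories, used_encoded_categories)}
    continuous:  {variable: {"Mean": ..., "StdDev": ...}}
    These argument shapes are hypothetical, not PyAF internals.
    """
    lines = [f"EXOGENOUS_VARIABLE_DETAIL_START used={list(used)}"]
    for name, (categories, used_encoded) in categorical.items():
        lines.append(f"EXOGENOUS_VARIABLE_DETAIL_CATEGORICAL {name!r} "
                     f"{list(categories)} {list(used_encoded)}")
    lines.append("EXOGENOUS_VARIABLE_DETAIL_CATEGORICAL_EXCLUDED "
                 f"{len(excluded_categorical)} {list(excluded_categorical)}")
    for name, stats in continuous.items():
        lines.append(f"EXOGENOUS_VARIABLE_DETAIL_CONTINUOUS {name!r} {dict(stats)}")
    lines.append("EXOGENOUS_VARIABLE_DETAIL_CONTINUOUS_EXCLUDED "
                 f"{len(excluded_continuous)} {list(excluded_continuous)}")
    lines.append("EXOGENOUS_VARIABLE_DETAIL_END")
    return lines


# Values copied from the sample log; the call itself is illustrative.
report_lines = format_exogenous_details(
    used=["Exog4", "Exog2", "Exog3"],
    categorical={
        "Exog3": (["AQ", "AR", "AS", "AT", "AU"], ["Exog3=AQ"]),
        "Exog4": (["P_T", "P_R", "P_U", "P_S", "P_Q"], ["Exog4=P_R"]),
    },
    continuous={"Exog2": {"Mean": 6.411764705882353,
                          "StdDev": 3.4365094970361736}},
)

# The default basicConfig format is LEVEL:logger_name:message.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pyaf.std")
for line in report_lines:
    logger.info("%s", line)
```

Keeping the formatting separate from the logging call makes the report easy to check in a unit test without capturing log output.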

antoinecarme commented 1 year ago

The cForecastEngine.to_dict() call will produce something like this:

        "Dataset": {
            "Exogenous_Data": {
                "Categorical_Variables": {
                    "Exog3": [
                        "AQ",
                        "AR",
                        "AS",
                        "AT",
                        "AU"
                    ],
                    "Exog4": [
                        "P_T",
                        "P_R",
                        "P_U",
                        "P_S",
                        "P_Q"
                    ]
                },
                "Continuous_Variables": {
                    "Exog2": {
                        "Mean": 6.411764705882353,
                        "StdDev": 3.4365094970361736
                    }
                }
            },
            "Signal": "Ozone",
            "Time": {
                "Horizon": 12,
                "TimeDelta": "<DateOffset: months=1>",
                "TimeMax": "1971-12-01 00:00:00",
                "TimeMin": "1955-01-01 00:00:00",
                "TimeVariable": "Time"
            },
            "Training_Signal_Length": 204
        },
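
For readers who want to consume this structure, here is a minimal sketch that walks a dictionary shaped like the "Dataset" excerpt above and prints one summary line per exogenous variable. It assumes only the nesting shown in the excerpt; where "Dataset" sits in the full to_dict() output is not shown here, so the helper takes the "Dataset" sub-dictionary directly. The function name summarize_exogenous is hypothetical.

```python
def summarize_exogenous(dataset):
    """Return one summary string per exogenous variable.

    `dataset` is assumed to look like the "Dataset" excerpt above.
    Missing sections are treated as empty rather than as errors.
    """
    exog = dataset.get("Exogenous_Data", {})
    report = []
    for name, categories in sorted(exog.get("Categorical_Variables", {}).items()):
        report.append(f"{name}: categorical, {len(categories)} categories "
                      f"({', '.join(categories)})")
    for name, stats in sorted(exog.get("Continuous_Variables", {}).items()):
        report.append(f"{name}: continuous, mean={stats['Mean']:.3f}, "
                      f"stddev={stats['StdDev']:.3f}")
    return report


# Sample data copied from the excerpt above.
dataset = {
    "Exogenous_Data": {
        "Categorical_Variables": {
            "Exog3": ["AQ", "AR", "AS", "AT", "AU"],
            "Exog4": ["P_T", "P_R", "P_U", "P_S", "P_Q"],
        },
        "Continuous_Variables": {
            "Exog2": {"Mean": 6.411764705882353, "StdDev": 3.4365094970361736},
        },
    },
    "Signal": "Ozone",
    "Training_Signal_Length": 204,
}

summary = summarize_exogenous(dataset)
for line in summary:
    print(line)
```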

antoinecarme commented 1 year ago

Hierarchical forecasting models can have exogenous data at each node or at the level of the whole tree.

For these models, this task will produce many reports, one for each use of exogenous data.