Chethan-Babu-stack / Machine-Learning-for-Evolving-graph-data

MIT License

Baseline model selection for multi-variate time series data #3

Open Chethan-Babu-stack opened 3 years ago

Chethan-Babu-stack commented 3 years ago

The dataframe columns are named "Origin_Destination" and the rows are indexed by day. There are nearly 35,000 edges (features, when the data is treated as a multivariate time series). ARIMA is a univariate time series model and VARMA is a multivariate one. I'm planning to use ARIMA and VARMA as baseline models.

As ARIMA is for univariate time series data, I would iterate over the columns one at a time, training and testing a separate model for each. But these edges are not completely independent. If there is any correlation (or causality, with one edge leading another) between two edges, it will not be captured.

For instance, flights between Bangalore (India) and Frankfurt (Germany) usually transit through Muscat. ARIMA will not capture this. Is it still fine to use ARIMA?

Currently using VAR and VARMA models.

j-petit commented 3 years ago

Yes, ARIMA is still fine. This is a limitation of the model. We can therefore show how more complex models improve the accuracy of predictions.