Tests and consolidation of VAR options

adam2392 commented 3 years ago

Right now, we need to consolidate options for VAR modeling:

l2 regularization
adding trends

Moreover, if we take code from statsmodels for the vector AR estimation, we might as well take their select_order function too, so users can also select the lag order of their VAR model using an information criterion.
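To make the idea of lag-order selection concrete, here is a minimal numpy-only sketch. It is not the statsmodels select_order code. The helper names `fit_var_ols` and `select_order_bic` are hypothetical, the model has no trend term, and only BIC (Schwarz) is computed, using the standard form log det(Sigma) + log(T) * p * K^2 / T on a sample shared by all candidate orders.

```python
import numpy as np


def fit_var_ols(data, n_lags, start):
    """Hypothetical helper: OLS fit of a VAR(n_lags) without trend.

    data : array, shape (n_samples, n_chs)
    start : index of the first sample used as a regression target, so that
        every candidate order is fit on the same set of targets.
    Returns the ML residual covariance, shape (n_chs, n_chs).
    """
    n_samples = data.shape[0]
    Y = data[start:]
    # Design matrix of lagged copies: [x[t-1], x[t-2], ..., x[t-n_lags]]
    Z = np.hstack([data[start - k:n_samples - k] for k in range(1, n_lags + 1)])
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ B
    return resid.T @ resid / len(Y)


def select_order_bic(data, maxlags):
    """Hypothetical stand-in for a lag-order search; returns (best_p, bic)."""
    n_samples, n_chs = data.shape
    n_obs = n_samples - maxlags
    bic = {}
    for p in range(1, maxlags + 1):
        sigma = fit_var_ols(data, p, start=maxlags)
        _, logdet = np.linalg.slogdet(sigma)
        # Penalty counts the p * n_chs**2 free coefficients.
        bic[p] = logdet + np.log(n_obs) * p * n_chs**2 / n_obs
    return min(bic, key=bic.get), bic


# Simulate a stable 2-channel VAR(2) to try the search on.
rng = np.random.default_rng(0)
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[-0.4, 0.0], [0.0, 0.4]])
data = np.zeros((2000, 2))
for t in range(2, len(data)):
    data[t] = A1 @ data[t - 1] + A2 @ data[t - 2] + rng.standard_normal(2)

best, bic = select_order_bic(data, maxlags=5)
print("selected lag order:", best)
```

A test against statsmodels could then compare the selected order and the criterion values on the same simulated data.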

All this functionality should be tested either:

against the statsmodels implementation
or a manual solution (e.g. for l2 regularization, just solve a 2x2 system with pinv)
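As a rough sketch of what the manual check for l2 regularization could look like: fit a ridge-regularized VAR(1) on a simulated 2-channel signal, then compare it with a pinv solution of the equivalent augmented least-squares system. The function `fit_var1_ridge` and its `l2_reg` parameter are hypothetical names for illustration, not the package's API.

```python
import numpy as np


def fit_var1_ridge(data, l2_reg=0.0):
    """Hypothetical helper: ridge-regularized VAR(1) fit.

    data : array, shape (n_chs, n_samples)
    Returns A, shape (n_chs, n_chs), such that x[t] ~= A @ x[t - 1].
    """
    X = data[:, :-1].T  # predictors, shape (n_samples - 1, n_chs)
    Y = data[:, 1:].T   # targets, shape (n_samples - 1, n_chs)
    n_chs = X.shape[1]
    # Closed-form ridge solution of Y ~= X @ B: B = (X'X + l2 * I)^-1 X'Y
    B = np.linalg.solve(X.T @ X + l2_reg * np.eye(n_chs), X.T @ Y)
    return B.T


# Simulate a stable 2-channel VAR(1).
rng = np.random.default_rng(0)
A_true = np.array([[0.6, 0.2], [-0.1, 0.5]])
data = np.zeros((2, 500))
for t in range(1, data.shape[1]):
    data[:, t] = A_true @ data[:, t - 1] + rng.standard_normal(2)

l2_reg = 3.0
A_hat = fit_var1_ridge(data, l2_reg=l2_reg)

# Manual reference: ridge equals ordinary least squares on an augmented
# system, so pinv on the stacked matrices should give the same 2x2 answer.
X = data[:, :-1].T
Y = data[:, 1:].T
X_aug = np.vstack([X, np.sqrt(l2_reg) * np.eye(2)])
Y_aug = np.vstack([Y, np.zeros((2, 2))])
A_manual = (np.linalg.pinv(X_aug) @ Y_aug).T

np.testing.assert_allclose(A_hat, A_manual, atol=1e-8)
```

The augmented-system trick works because stacking sqrt(l2_reg) * I under X adds exactly l2_reg * I to X'X while leaving X'Y unchanged.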

adam2392 commented 3 years ago

Another related issue is how to store VAR models with lags that are estimated over multiple epochs. Say we pass in epoched EEG data (maybe contiguous windows over time made with make_fixed_length_epochs); the data will then have shape (n_epochs, n_chs, n_samples).

If we only worry about storing a VAR(1) model, i.e. a traditional linear dynamical system, then we just get an EpochConnectivity container in return that stores data of shape (n_epochs, n_chs, n_chs).

If we now get a VAR(p) model, we get data that looks like: (n_epochs, lags, n_chs, n_chs) (how it is stored in statsmodels minus the epochs dimension), or (n_epochs, n_chs * lags, n_chs) (current implementation). If we account for an additional "lags" dimension, not sure if "EpochConnectivity" is the best container anymore.
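The two layouts hold the same numbers, so converting between them is a reshape. The sketch below assumes the stacked form is lag-major (the lag 1 block first) and that each block holds the transpose of the lag matrix, as in a regression of the form Y = Z @ B. That ordering is an assumption for illustration and may not match the current implementation.

```python
import numpy as np

n_epochs, n_lags, n_chs = 4, 3, 2
rng = np.random.default_rng(0)

# Layout 1: one (n_chs, n_chs) matrix per lag,
# x[t] = sum_k coefs[k] @ x[t - k - 1]
coefs = rng.standard_normal((n_epochs, n_lags, n_chs, n_chs))

# Layout 2: stacked regression parameters, (n_epochs, n_chs * n_lags, n_chs).
# Assumed ordering: lag-major blocks, each block holding coefs[k].T.
stacked = coefs.swapaxes(-1, -2).reshape(n_epochs, n_lags * n_chs, n_chs)

# And back again.
recovered = stacked.reshape(n_epochs, n_lags, n_chs, n_chs).swapaxes(-1, -2)
np.testing.assert_array_equal(recovered, coefs)

# Both layouts give the same one-step prediction for the first epoch.
past = rng.standard_normal((n_lags, n_chs))  # past[k] is x[t - k - 1]
pred_lagged = sum(coefs[0, k] @ past[k] for k in range(n_lags))
pred_stacked = past.reshape(-1) @ stacked[0]
np.testing.assert_allclose(pred_lagged, pred_stacked)
```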

Should I explore a hack to see if we can still use EpochConnectivity but handle special cases where lags is defined, or should I consider creating a new, say, EpochVARConnectivity container just for VAR models?

cc: @larsoner @agramfort @britta-wstnr

larsoner commented 3 years ago

If we now get a VAR(p) model, we get data that looks like: (n_epochs, lags, n_chs, n_chs) (how it is stored in statsmodels minus the epochs dimension), or (n_epochs, n_chs * lags, n_chs) (current implementation). If we account for an additional "lags" dimension, not sure if "EpochConnectivity" is the best container anymore.

Wouldn't it just be EpochTemporalConnectivity at that point? The lags are really time, right?

adam2392 commented 3 years ago

Hmmm, yeah we could represent them as "times" with the proper documentation. Although if you have a sliding VAR(p), then we just need to be careful of how we explain this.

Epochs are actual time points at which the VAR occurs from the original time series
Times are the time points wrt lag of the VAR model
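A small numpy sketch of that epochs/times split, with lags playing the role of "times". The raveled (n_epochs, n_chs * n_chs, n_lags) shape is an assumption about what a temporal container would take, and `lag_times` is a hypothetical label array.

```python
import numpy as np

n_epochs, n_lags, n_chs = 5, 3, 4
rng = np.random.default_rng(0)

# Stand-in for per-epoch VAR(p) coefficients, (n_epochs, n_lags, n_chs, n_chs)
coefs = rng.standard_normal((n_epochs, n_lags, n_chs, n_chs))

# Move lags to the last axis so they act as "times", then flatten each
# (n_chs, n_chs) connectivity matrix into a single node axis.
conn_data = np.moveaxis(coefs, 1, -1).reshape(n_epochs, n_chs * n_chs, n_lags)

# "Times" index the lag of the model, not samples of the recording.
lag_times = np.arange(1, n_lags + 1)

# Reading one entry back: effect of channel j on channel i,
# at lag index k, in epoch (window) e.
e, k, i, j = 2, 1, 3, 0
assert conn_data[e, i * n_chs + j, k] == coefs[e, k, i, j]
```

For a sliding VAR(p), the epoch axis then tracks where the window sits in the original time series, while the last axis only ever runs over lags 1 to p.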

adam2392 commented 3 years ago

I think the only remaining question I have is: should we handle "trends"? This causes issues with the shape of the data structure stored.

I don't think so, because we aren't modeling some traditional time series? However, if ppl feel strongly, I can figure out how to make it work.

Reference: https://www.statsmodels.org/stable/generated/statsmodels.tsa.vector_ar.var_model.VAR.fit.html#statsmodels.tsa.vector_ar.var_model.VAR.fit

britta-wstnr commented 3 years ago

I agree with @larsoner that we can just use EpochTemporalConnectivity for this (and document, maybe also have a good example). Regarding trends: I would wait and see if anyone ever needs it, you are right that it is probably not used a lot with neuroscientific data.

mne-tools / mne-connectivity

Tests and consolidation of VAR options #45