Independent component analysis (ICA) for the data

kajal5888 commented 2 years ago

Is your feature request related to a problem? Please elaborate. Independent Component Analysis (ICA) has been steadily gaining popularity in recent years as a viable technique to preprocess multivariate data sets, to disentangle information linearly mixed by volume conduction in the recorded data channels, and to perform or prepare for more general data mining.

dfsp-spirit commented 1 year ago

Development info:

The MNE function is mne.preprocessing.ICA. It offers three methods: fastica (sklearn.decomposition.FastICA), picard (from the picard package), and two variants of infomax (MNE's own implementation).
FieldTrip tutorials on ICA can be found at example/ica_ecg and example/ica_eog. Its default ICA method, invoked by running ft_componentanalysis(cfg, data) with cfg.method='runica', uses the EEGLAB implementation, which FieldTrip vendors: see external/eeglab/runica.m in their repo. The FieldTrip docs suggest running ft_componentanalysis only after artifact rejection, preprocessing, and downsampling.
EEGLAB provides various algorithms; some, like fastica and picard, are available via external toolboxes. The EEGLAB tutorial on ICA lists many options in the section "Which ICA Algorithm?".
The most popular ICA algorithm seems to be fastica, with packages available for all commonly used scientific programming languages (C++, R, Matlab, Python, ...).
The WIP feature branch for this is 328-ica.

dfsp-spirit commented 1 year ago

@kajal5888 I guess you intend to get results similar to those in FieldTrip? Apparently they support two different algorithms, one of which can be used with or without PCA. Quoting from their documentation of runica:

Perform Independent Component Analysis (ICA) decomposition
%            of input data using the logistic infomax ICA algorithm of 
%            Bell & Sejnowski (1995) with the natural gradient feature 
%            of Amari, Cichocki & Yang, or optionally the extended-ICA 
%            algorithm of Lee, Girolami & Sejnowski, with optional PCA 
%            dimension reduction.

I am unsure whether we will implement all these options right away, so could you let me know what your call to runica looks like? E.g., the simplest case with the default algorithm would be:

[weights,sphere] = runica(data);
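For orientation, the two outputs of runica are typically combined as activations = weights * sphere * data. The following is a minimal NumPy sketch of that relationship, not the EEGLAB implementation: the sphering matrix is computed here as the plain inverse square root of the channel covariance (runica's own scaling may differ), and `weights` is an identity placeholder standing in for trained ICA weights. All variable names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 4, 5000

# Hypothetical data (channels x samples): linearly mixed non-Gaussian sources.
sources = rng.laplace(size=(n_channels, n_samples))
mixing_true = rng.normal(size=(n_channels, n_channels))
data = mixing_true @ sources
data = data - data.mean(axis=1, keepdims=True)  # remove channel means

# Sphering (whitening) matrix: inverse matrix square root of the covariance.
cov = np.cov(data)
eigvals, eigvecs = np.linalg.eigh(cov)
sphere = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

# Placeholder for the weights an ICA algorithm would learn (assumption).
weights = np.eye(n_channels)

# How the two outputs are combined.
unmixing = weights @ sphere             # maps channels to components
activations = unmixing @ data           # component time courses
mixing_est = np.linalg.pinv(unmixing)   # columns are component topographies

# After sphering, the channels are uncorrelated with unit variance.
sphered_cov = np.cov(sphere @ data)
```

With real ICA weights in place of the identity placeholder, the columns of `mixing_est` are what one would show as component topographies.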

UPDATE: After some research, it seems that the amount of data (trial length) available to ICA is the main factor that determines whether or not people use the 'pca' flag. This explanation can be found in the EEGLAB ICA documentation, quote:

We usually run ICA using many more trials than the sample decomposition presented here. ICA works best when given a large amount of basically similar and mostly clean data. When the number of channels (N) is large (>>32), then a considerable amount of data may be required to find N components. When insufficient data are available, then using the ‘pca’ option to find fewer than N components may be the only good option. In general, it is important to give ICA as much data as possible for successful training.

More information can be found in the MNE docs:

The n_components parameter determines how many components out of the n_channels PCA components the ICA algorithm will actually fit. This is not typically used for EEG data, but for MEG data, it’s common to use n_components < n_channels. For example, full-rank 306-channel MEG data might use n_components=40 to find (and later exclude) only large, dominating artifacts in the data, but still reconstruct the data using all 306 PCA components. Setting n_pca_components=40, on the other hand, would actually reduce the rank of the reconstructed data to 40, which is typically undesirable.

If you are migrating from EEGLAB and intend to reduce dimensionality via PCA, similarly to EEGLAB’s runica(..., 'pca', n) functionality, pass n_components=n during initialization and then n_pca_components=n during [apply()](https://mne.tools/stable/generated/mne.preprocessing.ICA.html#mne.preprocessing.ICA.apply). The resulting reconstructed data after [apply()](https://mne.tools/stable/generated/mne.preprocessing.ICA.html#mne.preprocessing.ICA.apply) will have rank n.
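The rank statement in the MNE quote can be illustrated without MNE itself. The following NumPy-only sketch (synthetic data, hypothetical variable names, not MNE's code path) projects full-rank data onto its first n principal components and reconstructs it, which is the dimensionality reduction that happens around the ICA fit in the 'pca' case:

```python
import numpy as np

rng = np.random.default_rng(42)
n_channels, n_samples, n_keep = 16, 2000, 5

# Hypothetical full-rank recording, channels x samples.
data = rng.normal(size=(n_channels, n_samples))
centered = data - data.mean(axis=1, keepdims=True)

# PCA via SVD: the columns of u are the principal directions in channel space.
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pcs = u[:, :n_keep]                # (n_channels, n_keep)

reduced = pcs.T @ centered         # (n_keep, n_samples): the input ICA would see
reconstructed = pcs @ reduced      # back in channel space

rank_before = np.linalg.matrix_rank(centered)
rank_after = np.linalg.matrix_rank(reconstructed)
print(rank_before, rank_after)
```

The reconstructed array keeps all 16 channels but only has rank 5, which is why the MNE docs call rank reduction at reconstruction time "typically undesirable" unless it is intended.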

dfsp-spirit commented 1 year ago

Some interesting papers and articles related to ICA and MEG/EEG data:

dfsp-spirit commented 1 year ago

We decided to provide the FastICA method for now, using the implementation from scikit-learn, in a new user frontend function called runica, as in FT. This will add scikit-learn as a dependency of Syncopy, but that is absolutely fine with us, as almost everybody is most likely using it anyway.

See also ica sklearn example
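As a rough idea of what the scikit-learn side looks like, here is a minimal sketch using sklearn.decomposition.FastICA on two synthetic, linearly mixed sources. This is not Syncopy code, and the signals, mixing matrix, and variable names are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_samples = 2000
t = np.linspace(0, 8, n_samples)

# Two hypothetical non-Gaussian sources: a sinusoid and a square wave.
s1 = np.sin(2 * t)
s2 = np.sign(np.sin(3 * t))
sources = np.c_[s1, s2] + 0.05 * rng.normal(size=(n_samples, 2))

# Linear mixing into two "channels" (samples x channels, as sklearn expects).
mixing = np.array([[1.0, 0.5], [0.5, 2.0]])
observed = sources @ mixing.T

ica = FastICA(n_components=2, random_state=0, max_iter=1000)
estimated = ica.fit_transform(observed)   # (n_samples, n_components)

# ICA recovers sources only up to order, sign, and scale, so compare
# each true source with its best-matching component by absolute correlation.
corr = np.abs(np.corrcoef(sources.T, estimated.T)[:2, 2:])
best_match = corr.max(axis=1)
```

Note that scikit-learn uses a samples x features layout, so channel x sample data would need to be transposed before fitting.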

dfsp-spirit commented 1 year ago

We could have a look at this FT example dataset linked here for the unit tests.

dfsp-spirit commented 1 year ago

A great overview of the types of noise to expect can be seen at 5:52 in this YouTube ICA video by the EEGLAB team.
This paper explains artificial EEG data generation with artifacts in Section 5.1, along with methods of simulating them (see also its references 2,44,45). Eye artifacts are modeled with the sinc function.
This paper also discusses "Constructing synthetic EEG and eye-blink signals" by extracting eye blinks from real data, see start of Results section
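For unit tests, a synthetic channel with sinc-shaped eye blinks could look roughly like the sketch below. The sampling rate, amplitudes, blink width, and blink times are assumptions made for illustration; they are not taken from the papers mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 250.0        # sampling rate in Hz (assumed)
duration = 10.0   # seconds
t = np.arange(0, duration, 1.0 / fs)

# Background activity: a 10 Hz alpha-like oscillation plus white noise.
background = 5.0 * np.sin(2 * np.pi * 10.0 * t) + 2.0 * rng.normal(size=t.size)

# Eye blinks modeled as scaled sinc pulses centered at the blink times.
blink_times = [2.0, 5.5, 8.0]   # seconds (assumed)
blink_width = 0.2               # seconds, sets the width of the main lobe
blink_amp = 100.0               # arbitrary units, much larger than background
blinks = np.zeros_like(t)
for t0 in blink_times:
    blinks += blink_amp * np.sinc((t - t0) / blink_width)

# A hypothetical frontal channel that picks up the blinks at full strength.
frontal_channel = background + blinks
```

Scaling `blinks` by a different weight per channel before adding it to independent backgrounds would give a multichannel test set with a known artifact topography.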

Plotting: Before ICA, people want to identify artifacts (to decide whether or not to actually run ICA). After ICA, people typically want to plot channels. See https://mne.tools/dev/auto_tutorials/intro/40_sensor_locations.html# for info on sensor locations.

dfsp-spirit commented 1 year ago

ICA requires a lot more work than just the implementation of an ICA algorithm, e.g., topoplots so users can identify the components they want to filter from their data (heartbeats, eye blinks, etc.). This means we would have to invest lots of time into interactive plotting, which we do not want to do at this time. Furthermore, since MNE has great interactive viewers for these and our group/users hardly use ICA, we decided not to implement ICA in Syncopy for now.

Instead, we will make it easier to switch back and forth between Syncopy and MNE, as described in issue #491. This will allow users to perform ICA in MNE, then continue in Syncopy.

esi-neuroscience / syncopy

Independent component analysis (ICA) for the data #328