mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.72k stars 1.32k forks source link

How to obtain ICA weights for each component? #2404

Closed ImmanuelSamuel closed 8 years ago

ImmanuelSamuel commented 9 years ago

How to get the channel weights matrix for each ICA component? I can get the weights for the PCA components but not the ICA components. Coming from EEGLAB where specifying number of pca components gives me same number of ica components, this is different. Any docs or papers that can help me understand this? Thanks

jona-sassenhagen commented 9 years ago

Try this function

def _get_ica_map(ica, components=None):
    """Get ICA topomap for components"""
    fast_dot = np.dot
    if components is None:
        components = list(range(ica.n_components_))
    maps = fast_dot(ica.mixing_matrix_[:, components].T,
                    ica.pca_components_[:ica.n_components_])
    return maps

which will return the mixing matrix.

ImmanuelSamuel commented 9 years ago

Thanks

ImmanuelSamuel commented 8 years ago

Seems as thought the maps and components after getting the source doesn't match. Am I doing something wrong? The last topo seems to be the eye blink but after getting the sources seems as though the fourth spectrum is the eyeblink (large power no alpha)

image

jona-sassenhagen commented 8 years ago

Can you post a fully replicable example, that is, one using e.g. sample data?

BTW _get_ica_maps is now in mne.preprocessing.ica._get_ica_maps

jona-sassenhagen commented 8 years ago

Oh and the easiest way to plot topomaps is mne.viz.plot_topomap :)

ImmanuelSamuel commented 8 years ago

Data archive.zip

ImmanuelSamuel commented 8 years ago
# create montage - read data from SFP format
montage = mne.channels.read_montage('chanloc', ch_names=None, path=chanpath, unit='mm', transform=False)
chanNames= montage.ch_names
chNo = len(chanNames)
montage.ch_names = [str(each) for each in np.arange(chNo)]

# create info for mne epochs
info = mne.create_info([str(each) for each in np.arange(chNo)], 250, ch_types='eeg', montage=montage)

events = np.load('events')
data = np.load('data')
epochs = mne.EpochsArray(data, info, events, tmin=-0.5, event_id=None, 
                         reject=None, flat=None, reject_tmin=None, reject_tmax=None, 
                         baseline=baseline, verbose=False)

ica = mne.preprocessing.ica.ICA(n_components=0.99, method='fastica').fit(epochs)

fig, axes = plt.subplots(nrows=2, ncols=9, figsize=(20,7))
maps = np.transpose(_get_ica_map(ica))
infoTemp = mne.create_info([str(each) for each in np.arange(81)], 1, ch_types='eeg', montage=montage)
evoked = mne.EvokedArray(maps, infoTemp, 0, comment='', nave=1, kind='average', verbose=None)
evoked.plot_topomap(vmin=-4e6,vmax=4e6, cmap='jet',
                    size=2,contours=None,time_format='',colorbar=False,axes=axes[0,:],show=False);

icaSources = ica.get_sources(epochs).get_data()
freq, icaSrcSpec = spsig.periodogram(icaSources,axis=2,fs=250)
icaSrcSpecGrand = np.mean(icaSrcSpec,axis=0)
for comp in range(9):
    axes[1,comp].plot(freq, icaSrcSpecGrand[comp,:])
    axes[1,comp].set_xlim([0,20])
ImmanuelSamuel commented 8 years ago

The way I am plotting now with evoked makes sense in a different scenario. I am just copying code from another analysis. Yes this is a very round-about way of doing things lol

dengemann commented 8 years ago

I'm not sure I follow, the function suggested by Jona is essentially what ice.plot_components does, it maps the ICA components to the PCA space and and from there to the channels. What exactly would you like to get?

On Sun, Jan 10, 2016 at 6:34 AM, ImmanuelSamuel notifications@github.com wrote:

The way I am plotting now with evoked makes sense in a different scenario. I am just copying code from another analysis. Yes this is a very round-about way of doing things lol

— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2404#issuecomment-170316260 .

agramfort commented 8 years ago

here is my attempt to help you

please have a look at coding style and rules such as parameter naming.

I don't see any obvious bug.

you'll see the use of scalings as your maps don't have the same norm and it is arbitrary.

import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
import mne
chanpath = '.'

# create montage - read data from SFP format
montage = mne.channels.read_montage('chanloc', ch_names=None, path=chanpath,
                                    unit='mm', transform=False)
chanNames = montage.ch_names
chNo = len(chanNames)
montage.ch_names = [str(each) for each in np.arange(chNo)]

# create info for mne epochs
info = mne.create_info([str(each) for each in np.arange(chNo)],
                       250, ch_types='eeg', montage=montage)

events = np.load('events.npy')
data = np.load('data.npy')
baseline = None
epochs = mne.EpochsArray(data, info, events, tmin=-0.5, event_id=None,
                         reject=None, flat=None, reject_tmin=None,
                         reject_tmax=None, baseline=baseline, verbose=False)

ica = mne.preprocessing.ica.ICA(n_components=0.99,
                                method='fastica').fit(epochs)

fig, axes = plt.subplots(nrows=2, ncols=9, figsize=(20, 7))
maps = mne.preprocessing.ica._get_ica_map(ica).T
scalings = np.linalg.norm(maps, axis=0)
maps /= scalings[None, :]
info_temp = mne.create_info([str(each) for each in np.arange(81)], 1,
                            ch_types='eeg', montage=montage)
evoked = mne.EvokedArray(maps, info_temp, 0, comment='', nave=1,
                         kind='average', verbose=None)
evoked.plot_topomap(vmin=-0.6e6, vmax=0.6e6, cmap='jet',
                    size=2, contours=None, time_format='', colorbar=False,
                    axes=axes[0, :], show=False)

ica_sources = ica.get_sources(epochs).get_data()
ica_sources *= scalings[:, None]
freq, ica_src_spec = signal.periodogram(ica_sources, axis=2, fs=250)
ica_src_spec_grand = np.mean(ica_src_spec, axis=0)
for comp in range(9):
    axes[1, comp].plot(freq, ica_src_spec_grand[comp, :])
    axes[1, comp].set_xlim([0, 20])
    axes[1, comp].set_ylim([0, 2.8])

plt.tight_layout()

ica.get_sources(epochs).plot(scalings=dict(misc=5))

plt.show()
ImmanuelSamuel commented 8 years ago

What I said was that the maps and the sources do not seem to match. The maps show that the last component is eye blink. The spectrum of the sources seem to suggest that the 4th source is the eye blink.

@agramfort Thank you. Did the code you wrote show matching maps and sources?

ImmanuelSamuel commented 8 years ago

@agramfort I just ran your code and I see the same thing. The maps and the sources don't seem to match.

dengemann commented 8 years ago

Maybe some problem with your periodogram? I am certain that we don't have a bug at the ICA level, as I have used this code for hundreds of recordings and compared time-series with topographies. Can you instead plot the time-series rather than the periodogram?

ImmanuelSamuel commented 8 years ago

Or maybe this is correct. I always thought that the eyeblink has to have the highest power. Here the 8th and 9th ICA components are nearly 0 but still seem like the eye blink in the topo map.

The sources and the Spectrums look like they match its just the topo I am concerned about.

image

dengemann commented 8 years ago

Have you tried to find the EOG onsets from these components and count the number of artifacts? Maybe your EOG is just not dominant. The topomap does not tell you a lot about the power.

ImmanuelSamuel commented 8 years ago

Yeah I tried this in different segments of data and results match there. I guess as you said EOG is not dominant in this subject data. Also info max gives me max power for eye blink compared to fastica. Just FYI dont know why though. (ONly think I changed was method and immediately eog went from 1st place to 4th component)

dengemann commented 8 years ago

Yeah I tried this in different segments of data and results match there. I guess as you said EOG is not dominant in this subject data.

but can you tell me how many EOG artifacts you find for this subject in total? Can you plot the average EOG artifact?

Also info max gives me max power for eye blink compared to fastica.

What does this mean? Ordering of the components is not necessarily meaningful in fastica btw.

jona-sassenhagen commented 8 years ago

@ImmanuelSamuel , try this

ica_epochs = ica.get_sources(epochs)
for comp in range(ica.n_components_):
    mne.viz.plot_epochs_image(ica_epochs, comp)
    ica.plot_components(comp)

You will see that the components with the most blink-like topography also have blinks in single-trial view. The data is completely fine and the functions are working completely fine.

You can check that _get_ica_map returns components in the correct order by doing this:

from mne.viz.utils import _prepare_trellis
pos = mne.find_layout(info).pos[:,:2]
maps = _get_ica_map(ica)
fig, axes = _prepare_trellis(10, max_col=10);

for ax, ic_map in zip(axes, maps):
    mne.viz.topomap.plot_topomap(ic_map, pos, axis=ax, show=False);

You will see that this will have the same result as ìca.plot_components()`.

A few general recommendations:

  1. always record with infraocular EOG channels. They make artefact detection and correction much easier.
  2. the best way to identify or verify blink ICs is looking at their continuous or epochs data.
  3. IMO you get better ICA results fitting a few more components than 10 for 81 channels. ICA assumes there are only as many sources as channels. We know this is false for EEG: there are many more sources than e.g. 81. But ICA "degrades gracefully" in the sense that the residual variance is taken up by the less important components. However, if you do not leave much room for this variance (e.g. by fitting too few components), you will contaminate even the best components. (This may be subject to experimental falsification.)

Now for the main source of confusion: I think you are under the assumption that the order in which ICA returns ICs is sorted by their "Importance". Is that what you mean by "power"? In that case, as Denis notes, you should know that while this is how Infomax works (returning components sorted by their % of explained variance), fastica doesn't; fastica returns components in random order.

Does that settle the issue?

ImmanuelSamuel commented 8 years ago

OK. What I meant by power was spectral power. The eye blinks have large power and hence account for more variance in the data. In the image above the map that looks like eye blink (the last two maps) are nearly non-existent when you look at the sources. Their spectral power is also very very low. I just didn't expect that to happen.

@jona-sassenhagen I am not fitting just 10 components here right? There are essentially 81 components and I am getting the top n components based on the variance explained which I get by looking at 'ica.pca_ncomponents'. Here I am accounting for 0.999 variance. If you see the last two components their sources are essentially non existent. Their spectral power is like 1e-7 or something while the other are in the order of 1

@dengemann I tried to do that before but I couldn't get past adding events to stim channel. I am using just data arrays. I am not reading it from the recording. I think to get the EOG onsets I need raw data. For that I need to add some stim events but I got a message saying no stim channels found. THis is because I am making the raw array myself. Right now all I wanted to do was to remove the eye-blink ica but was getting too complex so just left that. Seemed like too much work for something simple. I am just doing some exploratory stuff for now of other people's data so I can't complain about not having full dataset for now.

ImmanuelSamuel commented 8 years ago

I guess I can close this issue now. Please let me know if I was wrong in my previous comment if you guys have time. Thanks!

jona-sassenhagen commented 8 years ago

How do you think could things be improved in the future for people in a situation similar to you?