mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.67k stars 1.31k forks source link

CSP error: LinAlgError : The leading minor of order 64 of B is not positive definite. #9094

Closed FriederikeScholten closed 5 months ago

FriederikeScholten commented 3 years ago

Describe the bug

This issue has been discussed on the Forum here with the suggestion to open a bug report.

I face the following error when running the CSP function using the EEG BCI dataset:

LinAlgError : The leading minor of order 64 of B is not positive definite. The factorization of B could not be completed and no eigenvalues or eigenvectors were computed.

Steps to reproduce

import numpy as np from mne.decoding import CSP

loaded = np.load('preprocessingoutput.npz') CSP(n_components=5, reg=1e-4).fit(loaded['x1'], loaded['y1'])

preprocessingoutput.npz and the preprocessing steps run on the EEG data are provided here.

Expected results

running CSP analysis.

Actual results

LinAlgError : The leading minor of order 64 of B is not positive definite. The factorization of B could not be completed and no eigenvalues or eigenvectors were computed.

Additional information

mne.sys_info() Platform: Windows-10-10.0.19041-SP0 Python: 3.8.3 (default, Jul 2 2020, 17:30:36) [MSC v.1916 64 bit (AMD64)] Executable: C:\Users\????\Anaconda3\python.exe CPU: Intel64 Family 6 Model 142 Stepping 10, GenuineIntel: 8 cores Memory: 7.9 GB

mne: 0.22.0 numpy: 1.18.5 {blas=mkl_rt, lapack=mkl_rt} scipy: 1.5.4 matplotlib: 3.3.3 {backend=module://ipykernel.pylab.backend_inline}

sklearn: 0.23.2 numba: 0.50.1 nibabel: Not found nilearn: Not found dipy: Not found cupy: Not found pandas: 1.0.5 mayavi: Not found pyvista: Not found vtk: Not found

agramfort commented 3 years ago

@FriederikeScholten sorry for the slow reaction time. I just tried your script on my system and I had no issue.

as you have a github repo you coudl setup GH actions with windows to test the script on a different windows system?

FriederikeScholten commented 3 years ago

@agramfort thanks for your suggestion. I set up the GH action and it still gives the same error when using windows-latest runner, but it works without a problem with an ubunto-latest runner.

FriederikeScholten commented 3 years ago

As an addition, a colleague ran the code on the Window system and in his WSL. The error pops up in the former, but not in the latter system.

agramfort commented 3 years ago

can you show me the log on GH action failure build?

FriederikeScholten commented 3 years ago

@agramfort I added the GH actions (one with windows and one with ubuntu) in my repository where you can see the failure for the windows version.

agramfort commented 3 years ago

do you see the same traceback on all systems:

Computing rank from data with rank=None 25 Using tolerance 0.00016 (2.2e-16 eps 64 dim 1.1e+10 max singular value) 26 Estimated rank (mag): 63 27 MAG: rank 63 computed from 64 data channels with 0 projectors 28 Setting small MAG eigenvalues to zero (without PCA) 29Reducing data rank from 64 -> 63 30Estimating covariance using SHRINKAGE 31Done. 32Computing rank from data with rank=None 33 Using tolerance 0.00017 (2.2e-16 eps 64 dim 1.2e+10 max singular value) 34 Estimated rank (mag): 63 35 MAG: rank 63 computed from 64 data channels with 0 projectors 36 Setting small MAG eigenvalues to zero (without PCA) 37Reducing data rank from 64 -> 63 38Estimating covariance using SHRINKAGE

sviter commented 3 years ago

I have the same problem. Any progress with this bug? I also can reproduce it on my Windows

larsoner commented 3 years ago

I cannot replicate on my Windows machine when creating a fresh env with conda env create -n mne -f environment.yml:

Platform:      Windows-10-10.0.19043-SP0
Python:        3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:37:25) [MSC v.1916 64 bit (AMD64)]
Executable:    C:\Users\larsoner\Anaconda3\envs\mne\python.exe
CPU:           Intel64 Family 6 Model 158 Stepping 9, GenuineIntel: 8 cores
Memory:        64.0 GB

mne:           0.24.dev0
numpy:         1.21.2 {blas=NO_ATLAS_INFO, lapack=lapack}
scipy:         1.7.1
matplotlib:    3.4.3 {backend=Qt5Agg}

sklearn:       0.24.2
numba:         0.53.1
nibabel:       3.2.1
nilearn:       0.8.0
dipy:          1.4.1
cupy:          Not found
pandas:        1.3.2
mayavi:        4.7.2
pyvista:       0.31.3 {pyvistaqt=0.5.0, OpenGL 4.5.0 NVIDIA 466.77 via NVIDIA GeForce GTX 650 Ti BOOST/PCIe/SSE2}
vtk:           9.0.1
PyQt5:         5.12.3

This seems to me more and more likely to be an Intel MKL bug, which will make it tough to fix.

Can you try with the environment variables MKL_NUM_THREADS=1 and then MKL_NUM_THREADS=2 to see if either works?

Looks like the logs are lost in your CI, and at least the last couple were green: https://github.com/FriederikeScholten/CSP-error-LinAlgError/actions . Can you push a commit (can be empty) so that I can look at the logs?

cbrnr commented 2 years ago

I think our CSP implementation has a problem with rank deficiencies. This problem keeps popping up (e.g. https://mne.discourse.group/t/csp-returning-linalgerror/3567), and I just stumbled into this error myself with a completely different dataset on my Mac. Therefore, I guess it's safe to say that this is not a Windows issue.

It seems like performing PCA as a preprocessing step (and discarding some components) resolves the issue. I think we should mention this in the CSP docs, including an example how to actually perform PCA (which seems to require UnsupervisedSpatialFilter(PCA) gymnastics that are far from obvious). Or maybe this is just a workaround and we should try to fix the underlying problem, not sure.

agramfort commented 2 years ago

+1

kolcs commented 2 years ago

I think I found one reason which can evoke the problem. If some of the channels are interpolated the CSP fails. I even tried the multi class CSP problem.

Code to reproduce the problem: (remove the comment in the event_id if you want to test the multi class problem)

import numpy as np
from mne import Epochs, pick_types, events_from_annotations
from mne.channels import make_standard_montage, make_eeg_layout
from mne.datasets import eegbci
from mne.decoding import CSP
from mne.io import concatenate_raws
from mne.io.edf import read_raw_edf
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

tmin, tmax = 1., 2.
event_id = dict(hands=2, feet=3,  
                # rest=0  # uncomment it for testing the multiclass CSP
                )
subject = 7
runs = [6, 10, 14]  # motor imagery: hands vs feet

def train_test_on_raw(raw, picks):
    events, _ = events_from_annotations(raw, event_id=dict(T1=2, T2=3, T0=0))

    epochs = Epochs(
        raw,
        events,
        event_id,
        tmin,
        tmax,
        proj=True,
        picks=picks,
        baseline=None,
        preload=True,
        verbose=False)
    labels = epochs.events[:, -1]

    # cross validation
    cv = StratifiedKFold(n_splits=5, shuffle=True)
    # get epochs
    epochs_data_train = 1e6 * epochs.get_data()

    # Assemble a classifier
    csp = CSP(n_components=4, reg='ledoit_wolf', log=True)

    clf = make_pipeline(
        csp,
        StandardScaler(),
        SVC()
    )

    scores = []
    for train, test in cv.split(np.arange(len(epochs_data_train)), labels):
        train_x = epochs_data_train[train]
        train_y = labels[train]
        test_x = epochs_data_train[test]
        test_y = labels[test]

        clf.fit(train_x, train_y)
        y_pred = clf.predict(test_x)
        class_report = classification_report(test_y, y_pred)
        conf_matrix = confusion_matrix(test_y, y_pred)
        acc = accuracy_score(test_y, y_pred)
        print(class_report)
        print("Confusion matrix:\n%s\n" % conf_matrix)
        print("Accuracy score: {}\n".format(acc))
        scores.append(acc)

    print("Accuracy scores for k-fold crossvalidation: {}\n".format(scores))
    print(f"Avg accuracy: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")

def bug():
    raw_files = [
        read_raw_edf(f, preload=True) for f in eegbci.load_data(subject, runs)
    ]
    raw = concatenate_raws(raw_files)
    eegbci.standardize(raw)
    try:  # check available channel positions
        make_eeg_layout(raw.info)
    except RuntimeError:  # if no channel positions are available create them from standard positions
        montage = make_standard_montage('standard_1005')  # 'standard_1020'
        raw.set_montage(montage, on_missing='warn')

    picks = pick_types(
        raw.info, meg=False, eeg=True, stim=False, eog=False, exclude='bads')

    # Apply band-pass filter
    raw.filter(7., 35., method='iir', picks=picks)

    # OK:
    train_test_on_raw(raw, picks)
    print('\n\nDone with CSP without interpolation........ \n\n')

    # bug in CSP after interpolating data....
    raw_cp = raw.copy()
    raw_cp.info['bads'].extend(['Fp1', 'Cz', 'P2'])
    raw_cp.interpolate_bads()

    train_test_on_raw(raw_cp, picks)

if __name__ == '__main__':
    bug()

mne.sys_info()

Backend TkAgg is interactive backend. Turning interactive mode on.
Platform:         Windows-10-10.0.22000-SP0
Python:           3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)]
Executable:       C:\Programs\Miniconda3\envs\bci\python.exe
CPU:              Intel64 Family 6 Model 85 Stepping 7, GenuineIntel: 24 cores
Memory:           Unavailable (requires "psutil" package)
mne:              1.0.dev0
numpy:            1.19.5 {blas=D:\\a\\1\\s\\numpy\\build\\openblas_info, lapack=D:\\a\\1\\s\\numpy\\build\\openblas_lapack_info}
scipy:            1.7.1
matplotlib:       3.4.3 {backend=TkAgg}
sklearn:          1.0
numba:            Not found
nibabel:          Not found
nilearn:          Not found
dipy:             Not found
cupy:             Not found
pandas:           1.3.4
pyvista:          Not found
pyvistaqt:        Not found
ipyvtklink:       Not found
vtk:              Not found
PyQt5:            Not found
ipympl:           Not found
pooch:            v1.6.0
mne_bids:         Not found
mne_nirs:         Not found
mne_features:     Not found
mne_qt_browser:   Not found
mne_connectivity: Not found
agramfort commented 2 years ago

thx @kcsaba12 for the detailed bug report. I managed then to look into it. Indeed when you interpolate you have rank reduced data. you could imagine that the regularization of the cov fixes the pb but MNE is a bit too smart about data rank by default. Basically is only applied the cov estimation in the subspace where the data lives. You have two options:

Use PCA

import numpy as np
from mne import Epochs, pick_types, events_from_annotations
from mne.channels import make_standard_montage, make_eeg_layout
from mne.datasets import eegbci
from mne.decoding import CSP, UnsupervisedSpatialFilter
from mne.io import concatenate_raws
from mne.io.edf import read_raw_edf
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

tmin, tmax = 1., 2.
event_id = dict(hands=2, feet=3,
                # rest=0  # uncomment it for testing the multiclass CSP
                )
subject = 7
runs = [6, 10, 14]  # motor imagery: hands vs feet

def train_test_on_raw(raw, picks):
    events, _ = events_from_annotations(raw, event_id=dict(T1=2, T2=3, T0=0))

    epochs = Epochs(
        raw,
        events,
        event_id,
        tmin,
        tmax,
        proj=True,
        picks=picks,
        baseline=None,
        preload=True,
        verbose=False)
    labels = epochs.events[:, -1]

    # cross validation
    cv = StratifiedKFold(n_splits=5, shuffle=True)
    # get epochs
    epochs_data_train = 1e6 * epochs.get_data()

    # Assemble a classifier
    csp = CSP(n_components=4, reg='ledoit_wolf', log=True)

    clf = make_pipeline(
        UnsupervisedSpatialFilter(PCA(61), average=False),
        csp,
        StandardScaler(),
        SVC()
    )

    scores = []
    for train, test in cv.split(np.arange(len(epochs_data_train)), labels):
        train_x = epochs_data_train[train]
        train_y = labels[train]
        test_x = epochs_data_train[test]
        test_y = labels[test]

        clf.fit(train_x, train_y)
        y_pred = clf.predict(test_x)
        class_report = classification_report(test_y, y_pred)
        conf_matrix = confusion_matrix(test_y, y_pred)
        acc = accuracy_score(test_y, y_pred)
        print(class_report)
        print("Confusion matrix:\n%s\n" % conf_matrix)
        print("Accuracy score: {}\n".format(acc))
        scores.append(acc)

    print("Accuracy scores for k-fold crossvalidation: {}\n".format(scores))
    print(f"Avg accuracy: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")

def bug():
    raw_files = [
        read_raw_edf(f, preload=True) for f in eegbci.load_data(subject, runs)
    ]
    raw = concatenate_raws(raw_files)
    eegbci.standardize(raw)
    try:  # check available channel positions
        make_eeg_layout(raw.info)
    except RuntimeError:  # if no channel positions are available create them from standard positions
        montage = make_standard_montage('standard_1005')  # 'standard_1020'
        raw.set_montage(montage, on_missing='warn')

    picks = pick_types(
        raw.info, meg=False, eeg=True, stim=False, eog=False, exclude='bads')

    # Apply band-pass filter
    raw.filter(7., 35., method='iir', picks=picks)

    # OK:
    train_test_on_raw(raw, picks)
    print('\n\nDone with CSP without interpolation........ \n\n')

    # bug in CSP after interpolating data....
    raw_cp = raw.copy()
    raw_cp.info['bads'].extend(['Fp1', 'Cz', 'P2'])
    raw_cp.interpolate_bads()

    train_test_on_raw(raw_cp, picks)

if __name__ == '__main__':
    bug()

Alternatively you should be able to specify the rank manually in CSP so MNE does not try to be too smart... to be tested

donalbertico commented 1 year ago

this error is not a bug, the problem is applying preprocessing techniques reduces the data rank and leads to mathematical incorrectness for CSP. Projecting back the data to the number of channels with PCA is a solution as shown by @agramfort. Refer here for more info https://www.researchgate.net/publication/343942514_Potential_pitfalls_of_widely_used_implementations_of_common_spatial_patterns

moritz-gerster commented 1 year ago

I have the same problem and I am a bit surprised about it because I have 22 LFP channels which I did not interpolate. Therefore I thought my data has full rank.

Can one of you experienced users explain to me how I choose the rank for my CSP or how I choose the number of PCA components to make it work? @cbrnr @donalbertico @agramfort

Of course I can also do my research first in case the answer is not so simple and depends on the use case.

cbrnr commented 1 year ago

Are you using an average reference?

moritz-gerster commented 1 year ago

Are you using an average reference?

Yes, I use a local average reference. However, I exclude the reference channel from my calculation.

cbrnr commented 1 year ago

Average referencing reduces the rank by one. I'm not sure what you mean by excluding your reference channel, but doing PCA and retaining N - 1 components prior to CSP should usually work.

moritz-gerster commented 1 year ago

Thanks for your help @cbrnr, I will try!

However, let's say I record 3 channels with a 4th reference channel and then I apply a common average reference. Are you saying the rank of my data is not 3 but only 2?

I only feed 3 channels to the CSP, that's what I meant by excluding the reference channel.

cbrnr commented 1 year ago

Usually you don't have the reference channel explicitly in your data – it's all zero by definition. When you average reference the three "active" channels, then yes, the rank is reduced to two. Including the all-zero reference doesn't really change anything regarding the ranks, average referencing still reduces it by one. But in this case, you reduce it from 4 to 3. I think in either case you should do PCA and discard the last PC.

cbrnr commented 1 year ago

Here's some code to show what I mean (hope you don't mind it's Julia):

using LinearAlgebra, Statistics

x1 = [1 2 3; 4 0 6; 7 8 9]
rank(x1)  # 3

x1_ar = x1 .- mean(x1, dims=2)  # average reference
rank(x1_ar)  # 2

x2 = hcat(x1, [0, 0, 0])  # includes reference channel
rank(x2)  # 3 already here!

x2_ar = x2 .- mean(x2, dims=2)  # average reference
rank(x2_ar)  # 3
moritz-gerster commented 1 year ago

When you average reference the three "active" channels, then yes, the rank is reduced to two.

But don't you show in your code that the rank remains 3?

import numpy as np

A = np.array([[1, 2, 3], [4, 0, 6], [7, 8, 9]])
np.linalg.matrix_rank(A) # 3

A_mean = A - A.mean(1)
np.linalg.matrix_rank(A_mean) # 3

So if you have 4 electrodes, 3 active ones, and 1 reference channel, the rank is always 3 - whether you use a bipolar reference, a monopolar reference, or a common average reference, correct?

Since I excluded my reference channel I considered my data full rank. We probably misunderstood each other. In any case, I will try your PCA trick!

cbrnr commented 1 year ago

But don't you show in your code that the rank remains 3?

No, it shows that average referencing reduces it to 2:

x1 = [1 2 3; 4 0 6; 7 8 9]
rank(x1)  # 3

x1_ar = x1 .- mean(x1, dims=2)  # average reference
rank(x1_ar)  # 2

If you include your reference channel in the data, you have four channels, but rank 3:

x2 = hcat(x1, [0, 0, 0])  # includes reference channel
rank(x2)  # 3 already here!

Now average referencing does not reduce the rank any more, it remains 3:

x2_ar = x2 .- mean(x2, dims=2)  # average reference
rank(x2_ar)  # 3

I think in your Python code you need to transpose, because the channels correspond to columns (and rows to time):

A_mean = (A.T - A.mean(1)).T
np.linalg.matrix_rank(A_mean) # 2
cbrnr commented 1 year ago

See also this post on the EEGLAB mailing list: https://sccn.ucsd.edu/pipermail/eeglablist/2020/015722.html

moritz-gerster commented 1 year ago

Thank you for explaining! 😊

ivopascal commented 7 months ago

I've been struggling with the same problem. I have a fix that works, but that I do not understand. I have rank reduced EEG data after applying ICA, but even when passing the correct rank to CSP I either get the error in the title (2 classes) or "SVD did not converge" (3+ classes).

I noticed that for the subjects where it would work MNE would log: Computing rank from data with rank={'eeg': 28, 'mag': 32}

For subjects where it would fail it would instead log: Computing rank from data with rank={'eeg': 27, 'mag': 27}

I do not have any MAG channels, but figured this was causing the problem. My solution was to set rank with:

rank = {
      'eeg': X.shape[1] - ica.exclude -1 , # Another -1 for Common Average Reference
      'mag': 32
  }

Everything works now. CSP filters look good and classification is good, although I have no clue why. If doing PCA is not an option because you want to visualise the CSP filters, this might be a workable hack.

cbrnr commented 7 months ago

@ivopascal you are saying you don't have MAG channels, does that mean your signal does not contain MAG channels or you are just not using them in your ICA?

ivopascal commented 7 months ago

I don't have any MAG channels. I'm only recording EEG, EOG and EMG. MAG is not in my dataset, but somehow ICA tries to set a rank for it, and somehow specifying a rank for MAG makes CSP work again.

Without manually setting the rank for MAG:

Computing rank from data with rank={'eeg': 30}
    Using tolerance 41 (2.2e-16 eps * 32 dim * 5.8e+15  max singular value)
    Estimated rank (mag): 31
    MAG: rank 31 computed from 32 data channels with 0 projectors
    Setting small MAG eigenvalues to zero (without PCA)
Reducing data rank from 32 -> 31
Estimating covariance using SHRINKAGE
Done.
Computing rank from data with rank={'eeg': 30, 'mag': 31}
    Setting small MAG eigenvalues to zero (without PCA)
[...]

Played around with this some more: setting a lower value for MAG (<=31) brings back the bug. I have 32 EEG channels.

cbrnr commented 7 months ago

This is really weird! If you don't have any MAG channels, the output of the rank doesn't make sense. Can you maybe share this dataset together with a minimal reproducible example that we could run?

ivopascal commented 6 months ago

Here's a minimal reproducible example as a Colab notebook:

https://colab.research.google.com/drive/1LBq8Sg-30Cnlj9sYLPtEcsPIyBtsvxF8?usp=sharing

Motor Imagery data is downloaded from EEGBCI dataset, ICA is applied and 4 arbitrary ICs are removed. CSP fails because the data is rank deficient, setting rank={"eeg": 60} does not fix it, but setting rank={"eeg": 60, "mag": 64} does, even though there is no MAG data.

cbrnr commented 6 months ago

For some reason, regularization uses MAG channels: https://github.com/mne-tools/mne-python/blob/main/mne/cov.py#L2104-L2105

Even if this is the right thing to do, it should probably not add MAG channels publicly (i.e. visible in the log), because this is very confusing. This doesn't answer the actual question why regularization only works if you set rank={"eeg": 60, "mag": 64} though.

@larsoner do you agree that there's something fishy going on? Or are we just not using CSP with regularization correctly? I mean, ideally, it should just work with csp = CSP(n_components=4, reg='shrinkage') IMO.

larsoner commented 6 months ago

Regularization with no rank supplied should work, correctly specified rank with no regularization should (ideally) work, and especially correctly specified rank with regularization should work. I can hopefully look today

larsoner commented 6 months ago

@ivopascal feel free to try #12476