mickcrosse / mTRF-Toolbox

A MATLAB package for modelling multivariate stimulus-response data
https://cnspworkshop.net
BSD 3-Clause "New" or "Revised" License

Matlab reports an error that Matrix is close to singular or badly scaled #11

Open hasibagen opened 2 years ago

hasibagen commented 2 years ago

Describe the bug

When mTRF-Toolbox cross-validates a model on discrete variables (e.g. phonetic features), MATLAB warns that the matrix is close to singular or badly scaled. Is there any way to fix or avoid this warning? Does it have a big impact on the results?

To Reproduce

Steps to reproduce the behavior:

  1. Run mTRFcrossval
  2. Input data size and type: stim = 9920x19 double array
  3. Input data contains only 0 and 1. stim: load('./data/LalorNatSpeech/dataCND/dataStim.mat','stim'); phon_feat = stim.data{3,1};

TRFtutorial_examples.m.zip

I ran lines 1 to 456 of this tutorial document and it reports this error. I trained the encoding model with phoneme features on my own data and it also reports the same error.


Screenshots

Training/validating model

0/16 [ ] Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 5.577082e-13.

In mTRFcrossval (line 242)


alexisdmacintyre commented 2 years ago

I came here to log the same issue, with this warning message appearing whatever stimulus/response I use.

hasibagen commented 2 years ago

> I came here to log the same issue, with this warning message appearing whatever stimulus/response I use.

Have you emailed the developers about this? I did write an email, but I haven't received a reply. I don't know whether there is something wrong on my end.

diliberg commented 2 years ago

Hi all,

Apologies for the delay in replying. I am afraid I can't look into this right now because of other commitments, but I'll get back to you about it. For now, let me mention that this warning usually arises because of issues with the scale of the stimulus or EEG signal. For example, you should be able to replicate the issue by multiplying your stimulus by a very large number. My advice is to apply appropriate normalisation/standardisation at the preprocessing stage. We can discuss this further.
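To make the scaling point concrete, here is a minimal sketch in base MATLAB. It does not call the toolbox: the toy data, the variable names, the fixed lambda, the plain identity penalty and the per-column z-scoring are all assumptions chosen for illustration. It compares the reciprocal condition number (the RCOND value quoted in the warning) of a regularised covariance matrix before and after the stimulus is multiplied by a very large number, and again after standardising.

```matlab
% Toy illustration only: hypothetical data, not the toolbox's own code.
rng(0);                                   % reproducible random numbers
n = 2000; p = 8;
x = randn(n, p);
x(:, p) = x(:, 1) + x(:, 2);              % linearly dependent column -> rank-deficient Cxx
lambda = 1;                               % fixed ridge parameter (assumption)
M = eye(p);                               % plain identity penalty (assumption)

% Stimulus on a "reasonable" scale (std close to 1)
rSmall = rcond(x' * x + lambda * M);

% Same stimulus multiplied by a very large number
xs = 1e6 * x;
rBig = rcond(xs' * xs + lambda * M);      % lambda is now negligible relative to Cxx

% Per-column z-scoring (requires implicit expansion, R2016b or later).
% Note: a constant column would give std = 0 and must be handled separately.
xz = (xs - mean(xs)) ./ std(xs);
rZ = rcond(xz' * xz + lambda * M);

fprintf('rcond original: %g\nrcond scaled:   %g\nrcond z-scored: %g\n', rSmall, rBig, rZ);
```

Per-column z-scoring is only one possible choice here. For binary features such as phonetic features, a single global scale factor may be more appropriate, which is part of why the normalisation should be decided per dataset.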

The warning can emerge when running the line "w = (Cxx + M)\Cxy/delta;". The backslash operator, or left matrix divide (mldivide.m), is the function I am talking about, so the issue is related to the inversion of the matrix. If you check the rank of Cxx in the cases where you see the warning, you should also notice that Cxx is not full rank, i.e. rank(Cxx) < size(Cxx,1), which is a problem for matrix inversion. However, even without delving into the theory, the problem should not arise if the input and output are appropriately scaled (normalisation is one way to do that, but it should be done carefully and discussed in the context of your specific dataset) and if the lambda range is appropriate. Note that continuous stim and EEG data with a standard deviation close to 1 usually have their best lambda around 1, so a range centred on 1 would suffice (e.g., [1e-2,1e-1,1,1e1,1e2,1e3]). But make sure that the optimal lambda doesn't saturate, i.e. that it is not always the maximum or minimum of the range.
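The rank check and the lambda-range check described above can be sketched as follows, again in base MATLAB without the toolbox. The binary toy matrix stands in for a design matrix of discrete features, the identity penalty is an assumption, and the vector of validation scores is a made-up placeholder rather than a real result. In practice you would substitute the Cxx from your own data and the cross-validation scores you obtained for each lambda.

```matlab
% Toy stand-in for a binary feature matrix; all names are hypothetical.
rng(1);
X = double(rand(5000, 6) > 0.8);          % sparse 0/1 features
X(:, 6) = X(:, 1);                        % duplicated column makes Cxx singular
Cxx = X' * X;

% Rank check from the comment above: true means Cxx is not full rank
isRankDeficient = rank(Cxx) < size(Cxx, 1);

% Reciprocal condition number of the regularised matrix for each lambda
lambdas = [1e-2, 1e-1, 1, 1e1, 1e2, 1e3];
M = eye(size(Cxx, 1));                    % plain identity penalty (assumption)
rc = zeros(size(lambdas));
for k = 1:numel(lambdas)
    rc(k) = rcond(Cxx + lambdas(k) * M);  % small values are what trigger the warning
end

% Saturation check: is the best lambda at either edge of the range?
rVal = [0.02, 0.04, 0.06, 0.05, 0.03, 0.01];  % placeholder scores, NOT real results
[~, best] = max(rVal);
isSaturated = (best == 1) || (best == numel(lambdas));

fprintf('rank deficient: %d, best lambda: %g, saturated: %d\n', ...
    isRankDeficient, lambdas(best), isSaturated);
```

If the best lambda does sit at an edge of the range, extending the range in that direction and re-running the cross-validation is the usual next step.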

Apologies that I don't have the time to go into more detail. If you want to know more about this, please ask more questions here or join the CNSP-workshop, where we will have tutorials and Q&A sessions on various topics around TRF analyses and the mTRF-Toolbox (see https://cnspworkshop.net).

Thank you. Kind regards, Giovanni