scot-dev / scot

EEG/MEG Source Connectivity Toolbox in Python
http://scot-dev.github.io/scot-doc/index.html
MIT License
58 stars 23 forks source link

Example implementation of SCoT with EEGLAB in MATLAB #49

Closed pntfx4 closed 9 years ago

pntfx4 commented 9 years ago

Can you suggest a means of utilizing this toolbox for ICA decomposition by way of EEGLAB in MATLAB.

Currently, to run the binary ICA the following code is executed: EEG = pop_runica(EEG,'icatype','binica','dataset',1,'options',{'extended',1},'chanind',1:size(EEG.chanlocs,2));

My preference would be to instead utilize SCoT (particularly as it appears possible to specify a non-random seed), having the python functions return the correct information to the EEGLAB structure.

An example of such an implementation would be very helpful!

mbillingr commented 9 years ago

This is going to be difficult, at least on the Matlab side. I am not sufficiently experienced with EEGLAB to give an example, but there could be different ways to achieve what you want:

If you are experienced with EEGLAB it would be much appreciated if you could contribute such an example.

pntfx4 commented 9 years ago

EEGLAB stores everything within a MATLAB structure. Data is in EEG.data Number of points is in EEG.pnts or can be computed with EEG.pnts = size(EEG.data,2); Number of channels is in EEG.nbchan Number of epochs is in EEG.trials

It should just be a matter of getting the right data to the scot routines and then returning the correct information to the EEG structure (EEG.icaact, EEG.icaweights, EEG.icasphere, EEG.icawinv).

It appears that MATLAB 2014b has a python implementation natively, but 2014a does not. Not sure which version you have, but that does add a wrinkle. In 2014b, there is some documentation here (http://www.mathworks.com/help/matlab/getting-started_buik_wp-3.html) and there are already routines for passing data between MATLAB and python (http://www.mathworks.com/help/matlab/data-types_buik_wp-4.html)

pntfx4 commented 9 years ago

After working on it yesterday in 2014b, it is actually fairly easy to implement python from MATLAB. For the most part its simply a matter of prefixing any code with 'py.' as long as that python code is in the system path. From the perspective of making scot easily usable as an EEGLAB plugin, EEGLAB can parse plugin files and automatically add paths to MATLAB. From there the paths can be inserted into the python system path. This appears to only temporarily add them, so if the plugin was removed it would not negatively impact the path list.

So a Matlab .m file in the main scot-master directory titled: 'eegplugin_scot.m' should have the following code:

            function eegplugin_scot(fig, try_strings, catch_strings)

                % add main folder to matlab path.
                fullpath = which('eegplugin_scot');
                currentmatlabpaths = strsplit(genpath(fullpath(1:end - length('eegplugin_scot.m'))),':');
                addpath(genpath(fullpath(1:end - length('eegplugin_scot.m'))));

                % Transfer matlab paths to python
                currentpythonpaths = cellfun(@char,cell(py.sys.path),'UniformOutput',false)';
                for cP = 1:size(currentmatlabpaths,1)
                    if ~(strcmpi(currentmatlabpaths{cP},currentpythonpaths))
                        insert(py.sys.path,int32(0),currentmatlabpaths{cP})
                    end
                end

            end

This would allow python to be able to find the scot folder within the EEGLAB plugins folder (where users add functionality to EEGLAB).

For instance: emptyset = py.list(int32(zeros(1,4)))

My thought is that it would be best to first code the ideal 2014b solution, then see how it could be done in prior versions which do not have the matlab-python connections built in.

The trick then is that MATLAB will not send a R x C matrix to python. It must be a 1 x C vector. So it would be helpful to see what they standard python progression is to understand how best to format the input from MATLAB.

pntfx4 commented 9 years ago

From a compatibility standpoint with previous MATLAB versions (or frankly compatibility with any other software) it might be worth considering attempting a an indirect data transfer approach.

Matlab could output the necessary data into a binary float format, and then make a call to trigger a python routine. That routine would read in the binary file, send the data to the ica function, then write the outputs to another binary file that the Matlab function could read in. This way all the calling software would have to do is be able to trigger the python function and it would not be necessary to worry about exchanging data between coding environments.

The first 3 numbers of the binary file could contain the number of points, channels, and trials. Enabling for restructuring the data into the correct matrix array. Similarly, other parameter information could also be embedded into the binary stream.

mbillingr commented 9 years ago

Sorry for my delayed answers, and thank you for putting so much effort into this issue.

I don't have access to Matlab 2014, so I can only guess about the Python interface from the online documentation. It seems that Matlab automatically imports Python modules, which is a bit weird but probably won't cause any problems.

SCoT represents data with numpy arrays, which are very similar to Matlab matrices. There might be an easy way to pass a RxC matrix as a numpy array to Python. Another possibility could be to pass the matrix as a list of column vectors, which is relatively easy to work with in Python.

A compatibility function for older Matlab versions could look somewhat like this:

function EEG = scotica(EEG)
    % Call `scotscript.py`, which loads
    save('tmp_data.mat', EEG.data)
    !python scotscript.py tmp_data.mat tmp_result.mat
    r = load('tmp_result.mat');
    EEG.icaweights = r.unmixing;
    EEG.icawinv = r.mixing;
    EEG.icasphere = 1;   % may need to be set to identity matrix instead
    EEG.icaact = data * EEG.icaweights;  % not sure on that one

The python script scotscript.py could look like this:

import sys
from scot.matfiles import loadmat, savemat
from scot.ooapi import Workspace

infile = sys.argv[1]
outfile = sys.argv[2]

inmat = loadmat(infile)

ws = Workspace({'model_order': 35})
ws.set_data(inmat['data'])
ws.do_mvarica()

result = {'mixing': ws.mixing_,
          'unmixing': ws.unmixing_}
savemat(outfile, result)

(The code is not tested and won't probably work as it is.)

pntfx4 commented 9 years ago

The Matlab-Python interface seems to have issues trying to import packages. So it might be best to adopt the other strategy.

The python script I tweaked a bit and tested. Two items did pop up:

1) When running the basic doica(), python reports: C:\Python27\lib\site-packages\scot\external\infomax.py:184: RuntimeWarning: overflow encountered in exp y = 1.0 / (1.0 + np.exp(-u))

2) The data arrangement is different between EEGLAB and scot. That should also necessitate transposing the resulting matrices, correct? EEGLAB (R:channel x C:point) vs scot (R:point x C: channel)

    import sys, os
    from scot.matfiles import loadmat, savemat
    from scot.ooapi import Workspace

    # Get input/output info
    infile = sys.argv[1]
    outfile = sys.argv[2]

    # Read in matrix
    inmat = loadmat(infile)
    eegdata = inmat['EEGdata']

    # Rotate data matrix from EEGLAB (R:channel x C:point) to scot (R:point x C: channel) format
    scotdata = eegdata.T

    # Create workspace and load in data
    ws = Workspace({'model_order': eegdata.shape[0]}, reducedim=0)
    ws.set_data(scotdata)

    # Perform ICA
    if (sys.argv[3] == 'mvarica'):
        ws.do_mvarica()
    else:
        ws.do_ica()

    # Output results
    result = {'mixing': ws.mixing_.T,
              'unmixing': ws.unmixing_.T}
    savemat(outfile, result)
    os.remove(infile)
mbillingr commented 9 years ago

1) You can ignore the warning, which I believe is caused by initialization issues in the ICA's internals.

2) That looks correct. Luckily transposing is very cheap with numpy :)

Is there a reason why you set model_order to eegdata.shape[0]? This seems excessively high to me, and I'm not sure if it would work.

pntfx4 commented 9 years ago

In looking through the code I was not exactly sure what the model_order parameter did. At least for the standard ICA it would appear that this:

# Create workspace and load in data
ws = Workspace({'model_order': eegdata.shape[0]}, reducedim=0)
ws.set_data(scotdata)
ws.do_ica()

is effectively this since no pca is being performed:

u = infomax(scotdata).T
m = sp.linalg.pinv(u)
pntfx4 commented 9 years ago

Relevant to the ICA, in the infomax function, the documentation notes that the extended option is the default but the parameters specified are set to false. As currently implemented there does not appear to be a way to send alternative parameter information to the infomax function by way of the workspace.

After manually editing the infomax extended default to True, the python infomax function appears to return similar solutions to that of the Matlab implemented ICA/binary ICA functions when the data is transformed like this before sending to python: [Note: this was gleaned from pop_runica.m and runica.m]

% Increase precision
data = double(data);
data = data - repmat(mean(data,2), [1 size(data,2)]); % zero mean (more precise than single precision)

% Remove overall row means
rowmeans = mean(data,2)'; 
for iii=1:size(data,1) 
    data(iii,:)=data(iii,:)-rowmeans(iii);
end

% Perform sphering
sphere = 2.0*inv(sqrtm(double(cov(data')))); % find the "sphering" matrix = spher()
data = sphere*data; % decorrelate the electrode signals by 'sphereing' them

the results from python are then loaded into the EEGLAB EEG structure:

% update weight and inverse matrices etc
EEG.icachansind = 1:chans;
weights = result.unmixing;
if size(weights,1) == size(weights,2) % if weights are square 
  EEG.icawinv = inv(weights*sphere);
else
  EEG.icawinv = pinv(weights*sphere);
end
EEG.icaweights = weights;
EEG.icasphere = sphere;
EEG.icaact = (EEG.icaweights*EEG.icasphere)*reshape(EEG.data, EEG.nbchan, EEG.trials*EEG.pnts);
EEG.icaact = reshape( EEG.icaact, size(EEG.icaact,1), EEG.pnts, EEG.trials);
pntfx4 commented 9 years ago

The matlab bash method (!python scotscript.py tmp_data.mat tmp_result.mat) seems to work on windows. On OSX, it fails reporting: ImportError: No module named scot.matfiles

But OSX can run the same command from the terminal without any issue (python scotscript.py tmp_data.mat tmp_result.mat)

mbillingr commented 9 years ago

I was not aware that extended Infomax was disabled by default. That is something to consider worth fixing in the near future. Thank you for pointing out that oversight.

The problem in OSX could be caused by the fact that OSX comes with its' own python interpreter, which can interfer with user-installed python, from what I know. Try to make sure you are running the correct python interpreter, and that scot is in the path.

pntfx4 commented 9 years ago

Does the model order parameter impact anything for mvarica? It would also look like this:

ws = Workspace({'model_order': 35})
ws.set_data(scotdata)
ws.do_mvarica()

is the same as this: (but this allows for specifying the extended option)

r = scotdata - var.predict(scotdata)
u = infomax(r, extended=True).T
mbillingr commented 9 years ago

Yes. MVARICA performs the ICA decomposition on the residuals of a VAR model (of order model_order). Typically the data is pre-transformed with PCA, but this is not necessary if you want to get the maximum number of ICA sources.

I normally set the model order to a value that I need to use later in my application too. I do not know what happens if you are interested in the ICA decomposition and the model order is too high. Most likely you get bad results because the VAR model overfits and the residuals are too low.

Model order selection can be automated (this can be computationally demanding and only works if you have multiple trials):

var.optimize_order(scotdata)
r = scotdata - var.predict(scotdata)
u = infomax(r, extended=True).T
pntfx4 commented 9 years ago

@kazemakase when you get a chance, can you email me? My lab has an idea for a conference abstract that if you are interested in we would like you (and the rest of the scot team) to be a part of.

cbrnr commented 9 years ago

@pntfx4 do I understand you correctly that the main reason why you want to call ICA from SCoT is because you cannot set a seed for EEGLAB's runica? Did you try to set the seed explicitly before calling it?

cbrnr commented 9 years ago

I just saw that the state is set to a value depending on the current time (line 812 in runica.m):

rand('state',sum(100*clock));

It could be worth a try to change this line to

rand('state', 0);

Maybe this will then produce the same results when you repeatedly call this function.

mbillingr commented 9 years ago

@pntfx4

when you get a chance, can you email me?

I don't have your email address.

@cle1109

It could be worth a try to change this line to

That won't work for the compiled binica, though.

pntfx4 commented 9 years ago

@cle1109 The biggest reason is that my lab is mostly PCs with a small number of Macs. I have been unable to find a compiled ICA binary for PC, so data processing slows down having to do ICA correction on the Macs. Python has the benefit of being platform independent. There is also the potential benefit that Python has in terms of its ability to better utilize the capabilities of the computer processor relative to the binary function (supposedly).

cbrnr commented 9 years ago

@pntfx4 I agree that Python is much better than MATLAB :-). If you are OK to jump through a lot of hoops just to use SCoT's ICA algorithm (which in fact we have borrowed from MNE), then great! However, it might be easier to find something that works with MATLAB/EEGLAB out of the box (as far as I understood it, you're using EEGLAB as your main processing tool). For instance, SIFT nicely works with EEGLAB and does a great job in estimating various connectivity measures.

Concerning the ICA algorithm, maybe CUDAICA works for your setting. It might also be worth poking around in the source code of binica, but I know that this is probably not an option (I tried to get it to compile once without success).

Finally, you could also install Linux on your PCs :-).

In any case, I'm just trying to list alternatives that might work better for you, but if you're fine using SCoT I'm happy to provide support (it's just that I think SCoT works best in a pure Python environment, but it's good to see people using it in different settings).