atlab / cov-est

Covariance estimation library used in "Improved Estimation and Interpretation of Correlations in Neural Circuits."
MIT License

Questions about the code and functional connectivity in general #1

Open kauttoj opened 7 years ago

kauttoj commented 7 years ago

Firstly, thank you for this very useful code. I have a few questions about the code and one more general question about functional connectivity.

As we know, real neural data are noisy and we have no 'ground truth' when estimating connectivity between neurons. With neuron-level data we cannot compare connections between animals (as people do in fMRI/MEG/EEG) and typically must work with a single experimental dataset. The way I see it, the only ways to convince ourselves that our connectivity estimates are valid are to (1) use the best available estimation method, which at the moment is your 'lv-glasso', and (2) check the cross-validated correlation matrices and see whether they agree across folds; any real strong connections and graph metrics (e.g., major hubs) should be stable across folds. What is your opinion on this? Is there anything else one can do to test whether connections are real?
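For concreteness, here is a minimal sketch of the fold-agreement check I have in mind (my own code, not from this repo; `C_folds` is a hypothetical cell array of per-fold correlation matrices):

    % Sketch: agreement of correlation estimates across cross-validation folds.
    % C_folds: cell array of p-by-p correlation matrices, one per fold.
    nFolds = numel(C_folds);
    p = size(C_folds{1}, 1);
    mask = triu(true(p), 1);                    % off-diagonal entries only
    agreement = nan(nFolds);
    for i = 1:nFolds
        for j = i+1:nFolds
            % correlate the vectorized upper triangles of two folds
            agreement(i, j) = corr(C_folds{i}(mask), C_folds{j}(mask));
        end
    end
    meanAgreement = mean(agreement(:), 'omitnan');  % mean pairwise fold agreement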

Here are some questions about the code:

(1) In your code you treat each time bin independently (i.e., as if they were separate conditions). Why is that? At least for dynamic stimuli (e.g., movies) with temporal dependencies, it seems more useful to compute the variance over all bins in order to track timecourses. For example, consider two neurons and their binned responses (e.g., ~200 ms bins) for a single repeat of a movie clip:

    neuron 1: [0,0,3,10,4,0,0,0,9,2,0,0]
    neuron 2: [0,0,2,15,6,1,0,0,5,4,0,0]

There is a strong correlation between the two, and, depending on the deconvolution method, many of the bins are naturally empty. The downside, however, is that if some bins in a trial contain bad data, all bins for that trial must be dropped.
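To make the alternative concrete, here is a sketch using the toy vectors above (my own code):

    % Sketch: correlate the full binned timecourses of the two example neurons.
    n1 = [0,0,3,10,4,0,0,0,9,2,0,0]';
    n2 = [0,0,2,15,6,1,0,0,5,4,0,0]';
    good = ~isnan(n1) & ~isnan(n2);   % drop bins flagged as bad (none here)
    r = corr(n1(good), n2(good));     % Pearson correlation over all bins jointly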

(2) In the hyperparameter optimization, what is the rationale for using this particular loss function for correlation matrices:

    L = (trace(Rp/R) + cove.logDet(R))/p + sum(log(V(:)).*N(:));

Is this more stable than simply computing the correlation or squared error between Rp and R, or the similarity of the thresholded networks (e.g., the top 5% of edges)?
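To make sure I am reading it correctly, here it is written out as a standalone function (my own sketch; I substituted a Cholesky-based log-determinant for cove.logDet):

    % Sketch: the hyperparameter-selection loss as I read it (not the repo's code).
    % Rp: p-by-p validation correlation matrix
    % R : p-by-p regularized correlation estimate (positive definite)
    % V : per-bin variance estimates; N: matching per-bin sample counts
    function L = valLoss(Rp, R, V, N)
        p = size(R, 1);
        logDetR = 2 * sum(log(diag(chol(R))));   % log-determinant via Cholesky
        L = (trace(Rp / R) + logDetR) / p ...    % Gaussian validation loss
            + sum(log(V(:)) .* N(:));            % variance log-likelihood term
    end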

(3) I am only interested in pairwise connections between neurons, so it seems that with 'lv-glasso' I only need to concentrate on this matrix:

     partial_corr_matrix = -corrcov(extras.S);

Is there any reason to instead use the actual (non-sparse) correlation matrix?

(4) Is the version in this repo the latest one? Have you made or considered any improvements/changes to the code or methods since the release?

kauttoj commented 7 years ago

A couple more details/thoughts.

For question (2): I am mostly interested in the third term, involving the logarithm of the variance. The first two terms appear to be standard parts of the loss function (as in, e.g., scikit-learn's GraphLasso), although one typically uses the covariance matrix instead of the correlation matrix.

For question (3): After analyzing more data, it seems that my sparse "extras.S" is often a diagonal matrix without any off-diagonal elements. My optimal hyperparameters also tend to be very small (~10^-3 or ~10^-4). Is this typical, or a sign of bad data?
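This is how I check for off-diagonal structure (a trivial sketch):

    % Sketch: count the nonzero off-diagonal entries of the sparse component.
    S = extras.S;
    offDiag = S - diag(diag(S));                 % zero out the diagonal
    fprintf('off-diagonal nonzeros: %d of %d\n', ...
        nnz(offDiag), numel(S) - size(S, 1));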

dimitri-yatsenko commented 7 years ago

We use the word "connectivity" rather loosely. The temporal scales that we examined (100 ms bins) preclude interpreting the connectivity matrix as a matrix of direct monosynaptic connections. However, structured correlations are a prominent and stable feature of population activity and serve as one of its descriptive statistics. Many similar phenomena and measurements are used in neuroscience. For example, receptive fields are correlations between sensory features and neuronal activity. They have been interpreted to suggest particular patterns of synaptic connections, but such simplified models are far from a complete explanation, and the role of receptive fields in the overall process of perception does not immediately follow from them.

However, if we have multiple statistical descriptions of population activity, which one is better in some sense? My thought was as follows. If C is a statistic of population activity and A is some other, completely independent description of the neuronal population (e.g., geometric positions, cell types, layers, response properties, or even synaptic connectivity), then the best statistic maximizes the mutual information I(C;A). In other words, it has the greatest capacity to differentiate properties of the network that were not already taken into account when computing C. We showed that a well-estimated (regularized) inverse covariance matrix was much better in this regard than the conventional correlation matrix. Some of these results were never published, but in general the new estimate could better categorize pairs of cells by distance, cell type, and sensory response properties.

There may be many proxies for I(C;A), which is difficult to estimate directly. The paper contains a few examples: even though we do not frame them in terms of I(C;A), the significance of the findings is expressed in those terms.
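As one hedged illustration of such a proxy (my own sketch, not the procedure from the paper): a plug-in estimate of the mutual information between a discretized connectivity statistic and a categorical pair label:

    % Sketch: plug-in estimate of I(C;A) between a connectivity statistic c
    % (one value per cell pair) and an integer-coded pair label a (e.g.,
    % 1 = same layer, 2 = different layer). Illustrative only.
    function mi = pairMI(c, a, nBins)
        edges = linspace(min(c), max(c), nBins + 1);
        cBin = discretize(c, edges);             % quantize the statistic
        pxy = accumarray([cBin(:), a(:)], 1);    % joint histogram
        pxy = pxy / sum(pxy(:));                 % joint distribution
        px = sum(pxy, 2);  py = sum(pxy, 1);     % marginals
        nz = pxy > 0;                            % avoid log(0)
        ratio = pxy ./ (px * py);                % departure from independence
        mi = sum(pxy(nz) .* log2(ratio(nz)));    % mutual information in bits
    end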

So this approach is quite different from declaring that the best estimate is the most stable one, i.e. the one that yields the same secondary statistics (such as clusters) across separate folds of the data.

dimitri-yatsenko commented 7 years ago

In our approach we considered noise correlations, i.e. correlated activity across repeated trials of the same stimuli after the average response has been subtracted, and we did not model the temporal dynamics of the population activity. I agree that modeling global dynamics in addition to estimating pairwise connectivity is a promising direction.
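In code terms, by noise correlations I mean something like the following sketch (the array layout is an assumption):

    % Sketch: noise correlations = correlations of trial-to-trial residuals
    % after subtracting the mean response to repeated presentations of the
    % same stimulus. X is assumed to be nBins-by-nRepeats-by-nNeurons.
    mu = mean(X, 2);                          % mean response across repeats
    resid = bsxfun(@minus, X, mu);            % trial-to-trial residuals
    resid = reshape(resid, [], size(X, 3));   % pool bins and repeats
    noiseCorr = corr(resid);                  % nNeurons-by-nNeurons matrix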

dimitri-yatsenko commented 7 years ago

For question 2: the loss function is a refactoring of the validation loss defined in Eq. 10 of the paper. It had to be refactored in order to assume the same correlation matrix across the individual bins of the response while allowing the variances to be conditioned individually on the timing from stimulus onset. This is described in Eqs. 20-25.
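Concretely, the factorization assumed there is one shared correlation matrix with bin-specific variances, along these lines (a sketch, not the repo's code):

    % Sketch: one common p-by-p correlation matrix R across bins; v_t holds
    % the p variances conditioned on bin t's timing from stimulus onset.
    sd = sqrt(v_t(:));                 % per-neuron standard deviations in bin t
    C_t = R .* (sd * sd');             % equals diag(sd) * R * diag(sd)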

dimitri-yatsenko commented 7 years ago

For question 2, the main motivation was not to achieve stability but to approach the ground truth Σ, which is unknown. However, we show in Eqs. 9, 10, and 11 that minimizing the validation loss is equivalent to minimizing the distance to the ground truth under this loss function. This cannot be shown for most other loss functions.

dimitri-yatsenko commented 7 years ago

Question 3: Although we did motivate our method as a separation of joint population activity into latent units plus direct interactions expressed as sparse pairwise coupling terms, in most results we used the pairwise partial correlations, which combine the two effects. Pairwise partial correlations are still fully (linearly) conditioned on the rest of the network, and they were more consistent estimates of connectivity than their sparse components alone.
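As a sketch of what that means for the outputs, the partial correlations come from the full inverse covariance estimate rather than its sparse part alone (iK is a hypothetical name for the assembled precision matrix):

    % Sketch: pairwise partial correlations fully conditioned on the network,
    % from the full inverse covariance estimate iK (assembled from the sparse
    % and low-rank parts as appropriate; iK is a hypothetical name).
    P = -corrcov(iK);                  % off-diagonal: -iK_ij / sqrt(iK_ii * iK_jj)
    P(1:size(P,1)+1:end) = 1;          % set the diagonal to 1 by convention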

dimitri-yatsenko commented 7 years ago

Question 4: This is the final version of the repo, but you may find additional findings and information in my thesis repo, https://github.com/dimitri-yatsenko/dy-thesis/blob/master/thesis.pdf, which I have now made public.