Open Div12345 opened 2 years ago
I've added an example comparing spectral_connectivity_epochs and time using simulated data and coherence here. Haven't yet cleared out some of the testing code.
TODO -
@Div12345 apologies for the delay. It is perhaps worth tabling this for now and actually it seems that there is some discrepancy as to if the implementation I ported over from Frites for time-resolved spectral_connectivity is correct or not.
Would you be up for perhaps taking a look at https://github.com/mne-tools/mne-connectivity/issues/70#issuecomment-1012241671 and comparing the implementation there vs what we have currently? I think @avoide raised a good point in https://github.com/mne-tools/mne-connectivity/issues/17#issuecomment-1011242482, which makes me suspect that either Frites, or I implemented time-resolved spectral_connectivity incorrectly.
Thankfully, @avoide has a working example, so if it makes sense, we can replace his implementation with ours in a PR. We just have to fashion it similarly to the existing mne-connectivity
API.
Once this is all sorted out, it makes sense to revisit the example started here. I can perhaps get to it in a few weeks otw.
Hey @adam2392 , no issues!
I can check once again.. I have been having some doubts about the indices parameter as well. I tried using specified channel indices to calculate the connectivity on some real data, but it seems to be giving some issues. Something about the estimated n_nodes being a certain number but the calculated being n_indices (the number of connections I wanted it to calculate).
Is this hardcoding here of the indices='symmetric' correct?
I'll raise another issue with all the exact errors I get and the debugging I did as well.
There are some simple implementations of the PLV and the coh over on the mne-features library as well - PLV function
@adam2392 While I didn't look specifically into the exact working, I did further testing with the methods. These are my testings for the single epoch over time methods -
Coherence tests -
rng = np.random.RandomState(0)
x1 = rng.rand(1, 1, n_samples)
x2 = np.sin(2 * np.pi * 10 * time)
x3 = np.sin(2 * np.pi * 55 * time) # some non-multiple
data = np.zeros((n_epochs, n_channels, n_samples))
data[0, 0, :] = x1
data[0, 1, :] = x2
data[0, 2, :] = x2 # x3
# Different values in different epochs
data[1, 0, :] = x1
data[1, 1, :] = x2
data[1, 2, :] = x3
Epoch 1 coherence -
Epoch 2 coherence -
I further did tests for PLV -
This is for
rng = np.random.RandomState(42)
x1 = rng.rand(1, 1, n_samples)
x2 = np.sin(2 * np.pi * 10 * time + 0)
x3 = np.sin(2 * np.pi * 10 * time + 0.05 * np.pi) # same for without phase diff also
This is for
rng = np.random.RandomState(42)
x1 = rng.rand(1, 1, n_samples)
x2 = np.sin(2 * np.pi * 10 * time + 0)
x3 = np.sin(2 * np.pi * 100 * time )
Seems to me like I'm getting the expected results.
This is using avoides implementation?
No, this is using what is integrated from frites i.e. spectral_connectivity_time
Hey @adam2392 , had a chance to go through that? Or let me know if you think it's still better to check the code once.. @avoide could you have a look too maybe.. I'll check once again about the issue you raised as well..
Hi @Div12345, I tried your code and also looked at your images. Is it correctly understood that in your 10 and 10Hz with Phase diff plot, x1-x1, x2-x2, x2-x3 and x3-x3 show the PLV = 1 as expected, but for some reason the x1-x2 and x1-x3 are the green line, which overlaps with the yellow one? If that is so, then I find this PLV with random generated data extremely high.
I used:
# Simulating data
rng = np.random.RandomState(0)
x1 = rng.rand(1, 1, n_samples)
x2 = np.sin(2*np.pi*10*time)
x3 = np.sin(2*np.pi*10*time+0.20)
data = np.zeros((n_epochs,n_channels,n_samples))
data[0,0,:] = x1
data[0,1,:] = x2
data[0,2,:] = x3
# # Same value for each channel in all the epochs
# for i in range(n_epochs):
# data[i] = data[0]
# Different values in different epochs
data[1,0,:] = x1
data[1,1,:] = x2
data[1,2,:] = x3
And on average the PLV over time was 0.82 between a 10Hz sinus wave and random noise! This seems a bit too high.
One problem I mentioned in my latest comment in #17 is that randomly generated signals showed abnormally high PLV, which can also be observed using your code:
# Simulating data
rng = np.random.RandomState(0)
x1 = rng.rand(1, 1, n_samples)
x2 = rng.rand(1, 1, n_samples)
x3 = rng.rand(1, 1, n_samples)
Here on average across the epoch the PLV matrix is: [0.98156652 0.696885 0.77922573] [0.696885 0.98156652 0.7089454 ] [0.77922573 0.7089454 0.98156652] Thus around 0.7 to 0.78! For randomly generated data we would expect 0, although of course due to the finite nature and wavelets/filters etc we would expect some phase locking due to chance but not this high I think.
Using my PLV implementation I got: [0. , 0. , 0. ], [0.25187146, 0. , 0. ], [0.1383034 , 0.2419822 , 0. ]] Which seemed more reasonable. Bear in mind this was only one simulation, in #17 I repeated it 100 times and got:
PLV using spectral_connectivity_time over 100 runs with randomly generated data [[0.98156652 0.64840676 0.65101158] [0.64840676 0.98156652 0.64902626] [0.65101158 0.64902626 0.98156652]] PLV using my implementation over 100 runs with randomly generated data [[0. 0. 0. ] [0.14877284 0. 0. ] [0.15452367 0.15971872 0. ]] Which raised some flags concerning spectral_connectivity_time.
I'm inclined to believe there are issues here simply based off of the simulation that @avoide proposes. It seems weird to me that the spectral_connectivity_time
does not work for such a simple simulation, so I'm concerned perhaps the Frites implementation either has a bug, or there are some caveats to how it is used. I'm 95% sure my implementation is a correct copy as I unit test against their implementation.
Some potential ways forward are:
spectral_connectivity_time
with @Avoide 's implementation and move forward with that one.Pinging @EtienneCmb here for some thoughts.
Thanks @adam2392, I'll try to reproduce @Avoide results tomorrow.
Thanks @adam2392, I'll try to reproduce @Avoide results tomorrow.
Hi @EtienneCmb any luck on this?
Hi @adam2392 and @Avoide,
I was able to reproduce @Avoide results, with high PLV values even for random data. I also tested the code for different temporal smoothing (default is using morlet's wavelets with a fixed 7 cycles) and PLV baseline values are decreasing with longer kernels (same for coherence). I guess that the longer the data, the better the estimation.
import numpy as np
import xarray as xr
from frites.conn import conn_spec
import matplotlib.pyplot as plt
sfreq = 64.
n_samples = 5000
times = np.arange(n_samples) / sfreq
freqs = np.linspace(2., 20, 50)
# generate random data
rng = np.random.RandomState(0)
x = xr.DataArray(
rng.rand(1, 3, n_samples), dims=('trials', 'roi', 'times'),
coords=([0], ['x1', 'x2', 'x3'], times)
)
# compute plv for different temporal smoothings
plv = {}
for sm_times in [.5, 1., 2., 5., 10., 20., 30.]:
plv[sm_times] = conn_spec(
x, roi='roi', times='times', freqs=freqs, metric='plv', sfreq=sfreq,
n_jobs=1, sm_times=sm_times, verbose=False
).squeeze().mean(('times', 'roi'))
plv = xr.Dataset(plv).to_array('sm_times')
# plot the results
plt.figure(figsize=(12, 8))
plv.plot(x='freqs', hue='sm_times')
plt.xlabel('Frequencies'), plt.ylabel('PLV'), plt.ylim(0, 1)
plt.title('PLV for different sm_times', fontweight='bold');
If you add a 10hz pure sine to the random data, longer kernels provide a better ratio between the PLV at 10Hz and the noise :
# generate random data
rng = np.random.RandomState(0)
x = xr.DataArray(
rng.rand(1, 3, n_samples), dims=('trials', 'roi', 'times'),
coords=([0], ['x1', 'x2', 'x3'], times)
)
x.data += .3 * np.sin(2 * np.pi * 10 * times).reshape(1, 1, -1)
There are also differences between morlet and multi-tapers. With the latter, I got better results
# compute plv for different temporal smoothings
plv = {}
for sm_times in [.5, 1., 2., 5., 10., 20., 30.]:
plv[sm_times] = conn_spec(
x, roi='roi', times='times', freqs=freqs, metric='plv', sfreq=sfreq,
n_jobs=1, sm_times=sm_times, verbose=False, mode='multitaper',
n_cycles=freqs / 2, mt_bandwidth=10
).squeeze().mean(('times', 'roi'))
plv = xr.Dataset(plv).to_array('sm_times')
I don't know if it is an implementation issue or if it is an inherent limitation of measuring the COH / PLV over time. @ViniciusLima94, any opinion on this ?
Hello @EtienneCmb , @adam2392 and @Avoide .
Concerning the bias of the metric, you can refer to Lachaux et. al. 2002
In summary, it depends on the number of cycles and on the smoothing window (see image above). Note that it is possible to make the bias independent of the frequency by defining $n_{cycles} = \alpha * freqs$.
I have a few examples in the following notebook: https://github.com/ViniciusLima94/GrayData-Analysis/blob/master/notebooks/3.0.9.%20Lachaux%20wavelet%20coherence.ipynb
Notably, below I show the bias computed analytically with the coherence computed for white noise signals as a function of the temporal smoothing kernel ($w_t$ in seconds):
Also, for $n_cycles = constant$ you can see that the bias is frequency dependent:
While for $n_{cycles} = \alpha * freqs$ no:
Let me know if I can further help.
Also, @Avoide correct me if I'm wrong, but your implementation is a single-trial & over-time PLV implementation. The one in Frites is also a single-trial implementation, computed over time but using kernel smoothing to retrieve the "temporal" dynamic of the PLV (modulo kernel length). I guess it would be like using your implementation on sliding windows. This should be take into consideration when comparing the results of both methods imo.
Also, @Avoide correct me if I'm wrong, but your implementation is a single-trial & over-time PLV implementation. The one in Frites is also a single-trial implementation, computed over time but using kernel smoothing to retrieve the "temporal" dynamic of the PLV (modulo kernel length). I guess it would be like using your implementation on sliding windows. This should be take into consideration when comparing the results of both methods imo.
Yes that is correct. The implementation I posted is based on the epochs, thus it correspond to using non-overlapping windows.
Sorry I've been swamped, so lost track of this.
Is there a summary then of the bug that needs to be fixed? Is @Avoide 's implementation correct, or the Frites version, or both?
This is now addressed by #104, if the contributors of this discussion would like to have a look and comment.
@EtienneCmb I believe the reason why the PLV values appear seemingly too large in the example you posted above is an error in the Frites implementation of PLV. I might be mistaken, so please correct me if there's a misconception in my thinking instead.
This is your PLV implementation in Frites:
def pairwise_plv(w_x, w_y):
# computes the plv
s_xy = w[:, w_y, :, :] * np.conj(w[:, w_x, :, :])
# complex exponential of phase differences
exp_dphi = s_xy / np.abs(s_xy)
# smooth e^(-i*\delta\phi)
exp_dphi = _smooth_spectra(exp_dphi, kernel)
# computes plv
out = np.abs(exp_dphi)
# mean inside frequency sliding window (if needed)
if isinstance(foi_idx, np.ndarray):
return _foi_average(out, foi_idx)
else:
return out
In the above example you provided, the output array is averaged over the time axis to obtain a single scalar PLV value for each frequency point.
In mathematical notation, this corresponds to: $$PLV{x, y}=\frac{1}{T}\sum{t=1}^T|\tilde{S}_{xy}|$$
where $\tilde{S}_{xy}$ is the normalized and smoothed cross spectrum of signals $x$ and $y$.
However, the definition of time-resolved PLV is: $$PLV{x,y}=\frac{1}{T}|\sum{t=1}^T\tilde{S}_{xy}|$$
Taking the absolute value before the average leads to too large values, since in general $|a+b| \le |a|+|b|$.
Furthermore, let's look at this part of the code (edited the comments for readability):
exp_dphi = s_xy / np.abs(s_xy) # complex exponential of phase differences
exp_dphi = _smooth_spectra(exp_dphi, kernel) # smooth e^(-i*\delta\phi)
out = np.abs(exp_dphi) # computes plv
Now again, I might be mistaken, but to me it seems that we have exp_dphi = s_xy / np.abs(s_xy)
, so exp_dphi
is now an array full of normalized complex numbers. If we were to skip the smoothing step, we would have out = np.abs(exp_dphi)
, so out
is now an array that contains the absolute values of normalized complex numbers.
In other words, without the smoothing step, out
is now an array full of ones. I don't know if this is by design, but in any case something of note.
To verify this, I quickly commented out the smoothing, and indeed:
Computing the average first and then taking the absolute value leads to much more reasonable PLV scores for uncoupled signals. Code:
def pairwise_plv(w_x, w_y):
# computes the plv
s_xy = w[:, w_y, :, :] * np.conj(w[:, w_x, :, :])
# complex exponential of phase differences
exp_dphi = s_xy / np.abs(s_xy)
# smooth e^(-i*\delta\phi)
exp_dphi = _smooth_spectra(exp_dphi, kernel)
# computes plv
out = np.abs(np.mean(exp_dphi, axis=-1, keepdims=True))
# mean inside frequency sliding window (if needed)
if isinstance(foi_idx, np.ndarray):
return _foi_average(out, foi_idx)
else:
return out
Random data: 10Hz coupling:
I believe the reason that we don't see a PLV value of exactly one for 10Hz is that Frites takes the average of the spectra produced by mne.time_frequency.tfr_array_multitaper
. Instead of taking the average, it seems that connectivity scores should be computed first and then averaged. This is discussed in #84 and solved in #104.
I hope this helps. Let me know if I'm misunderstanding something. Also a note: The latest MNE-Python version 1.2.0 breaks Frites due to the removal of _get_args.
Hi @ruuskas,
Thank you for the clear explanations, I'll also fix the issue in Frites. @ViniciusLima94, if you've time, could you take a look at it also?
The latest MNE-Python version 1.2.0 breaks Frites due to the removal of _get_args.
Yes, thank you, I fixed it recently.
Hello @ruuskas,
Indeed, when estimating PLV at single-trial level, removing the smoothing leads to 1 valued PLV for all frequency and time points.
The smoothing in this case should be equivalent to using small integrating windows. This procedure introduces a bias in the metric (for this I refer Lachaux et. al. 2002, I also sent a reply with an explanation of this above), leading to high values. Taking the average over time also eliminates this bias, however, we do not get a time-resolved metric.
What I usually do, is create surrogate measurements to subtract from the original one (for instance take the value of the 95% over the surrogate distribution).
For instance bellow I show the trial averaged TF map of PLV for the corrected and uncorrected values:
Hi @ViniciusLima94.
Thanks for the explanation. What is the interpretation of the trial averaged corrected PLV you show in the left figure? Would this type of estimation be used for resting-state or stimulus data? My understanding would be that looking at the time-resolved PLV produced by averaging over trials would only be sensible if the trials are generated in a meaningful way (for example, stimulus onsets).
Could you clarify the process of going from the uncorrected to the corrected PLV? I understood that you would create surrogates of the original time-course, then compute the PLVs, and only accept those values that are higher than the 95th percentile of the surrogate distribution. Is that correct?
Hello @ruuskas,
In the example above I sent the trial averaging because since it is an example model the trials have all the same structure. Usually, we are interested in single-trial estimation, so we don't do this averaging.
You described it right, usually, I do it like this. Depending on how you do you can generate chance levels for each (time, frequency) or (time, frequency, trial) point.
@ViniciusLima94 I understand. What sort of data do you normally use? I'm just curious because when using real data and computing all-to-all connectivity, the computational time is high enough for a single estimation of connectivity. In that scenario, computing e.g. 1000 surrogates would not be practical.
@ruuskas you're right for my case it is not practical at all to have many surrogates. I do a simplification, and assume that thresholds are the same over trials, so what I do is to take the 95% of the distribution across trials, which yields a threshold per (time, frequency) point. Like in the cartoon below:
My data is recorded during a working memory task, I can show you a few single-trial coherence (already corrected) examples:
@ruuskas do you know the impact of computing connectivity on average PSD over tapers vs. averaging the connectivity metric?
@ruuskas do you know the impact of computing connectivity on average PSD over tapers vs. averaging the connectivity metric?
@EtienneCmb it's not a huge effect, unfortunately I fear I don't have an explicit comparison saved.
@ViniciusLima94 I see, so it seems that we're trying to tackle a different problem on the conceptual level. My understanding has been that the function spectral_connectivity_time
is trying to compute connectivity over time, similar to the way spectral_connectivity_epochs
computes connectivity over trials. So this is quite different from the spectrotemporal connectivity in your case.
Did I get it right that in the example you posted no surrogate data was generated, but rather the trials themselves were used as "surrogates"?
Hi, all thanks to @ruuskas contributions in https://github.com/mne-tools/mne-connectivity/pull/104, I think the underlying bug/incongruiences that were discovered in this and related issues have been resolved.
@Div12345 not sure if you're still interested because I realize a significant amount of time has passed, but I think it is possible to proceed with an example demonstrating the differences in usage of the two spectral connectivity methods.
PR Description
Closes #70 . Example showing difference between spectral_connectivity_epochs and spectral_connectivity_time
Merge checklist
Maintainer, please confirm the following before merging: