Closed cbrnr closed 6 years ago
Take a look at #5010. Also, here is a function that I use for transparency masking: https://github.com/mmagnuski/mypy/blob/master/mypy/viz.py#L591 It uses a slightly different approach to Jona's PR: instead of plotting masked arrays with alpha, it plots an RGBA mask on top of the TFR. This allows finer control of the color and transparency of every pixel in the mask, but has some disadvantages in interactive mode (because the RGBA mask is on top, moving the cursor around shows the mask values in the corner, not the TFR values). It also allows for valid outlines (normal matplotlib contours can be ugly in dichotomous cases):
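In case it helps, here is a minimal self-contained sketch of that RGBA-overlay idea (toy array names, not the actual code from viz.py): build an `(n_freqs, n_times, 4)` RGBA array that is fully transparent over significant pixels and semi-opaque white elsewhere, then draw it on top of the TFR image.

```python
import numpy as np

tfr = np.linspace(-1.0, 1.0, 5 * 7).reshape(5, 7)  # stand-in TFR values
sig_mask = np.abs(tfr) > 0.5                       # stand-in significance mask

overlay = np.zeros(tfr.shape + (4,))               # RGBA, all zeros = invisible
overlay[..., :3] = 1.0                             # white
overlay[..., 3] = np.where(sig_mask, 0.0, 0.8)     # opaque only where NOT significant

# ax.imshow(overlay, extent=extent, origin="lower", aspect="auto")
# would then be drawn over the existing TFR image.
```

Because every pixel gets its own alpha (and color), you can grade the transparency instead of only switching it on/off.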
Oh, you are asking about something else actually:
However, my modified example doesn't look right. The masked regions do not fit the TFR at all, it seems like the non-significant patches are arbitrarily placed on the TFR image
Make sure that your extents are correct. BTW, the extents set by `tfr` may not be super-correct. Take a look at my `heatmap` function to see how they should behave. I'll try to check and fix the extents if needed later this week.
Thanks for the pointers to #5010 and your function, good to see that other people seem to need this kind of functionality too.
Regarding the extents, are you sure this could be the problem? I'm getting the array from the image, adding a mask, and then putting the array back into the image:
```python
arr = ax[i].collections[0].get_array().reshape(shape)
arr[mask] = masked
ax[i].collections[0].set_array(arr.flatten())
```
The extent that was set before should still be valid, shouldn't it?
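For reference, here is a self-contained toy version of that idea (data, mask, and shapes are made up; it assumes a matplotlib `QuadMesh` returned by `pcolormesh`):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

data = np.linspace(-2.0, 2.0, 5 * 7).reshape(5, 7)  # one channel's toy TFR
mask = np.abs(data) < 1.0                           # toy non-significant cells

fig, ax = plt.subplots()
mesh = ax.pcolormesh(data)

arr = np.asarray(mesh.get_array()).reshape(data.shape)  # pull plotted values back out
arr = np.ma.masked_where(mask, arr)                     # hide non-significant cells
mesh.set_array(arr)                                     # push the masked array back in
```

Masked cells are simply not drawn, which is why they come out as the axes background color (white by default).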
> I'm getting the array from the image, adding a mask, and then putting the array back into the image
Yes, I looked more closely at how you plot it after posting and I don't think extents should be a problem here. However I find it hard to understand where the masks should actually be. Could you create a shorter and simple example with small arrays that captures the problem?
I think the problem is that I can't run the cluster test on `tfr.data`, because with `tfr.plot` I'm actually plotting the average TFR, but rescaled (baseline-corrected) with the `'percent'` method.
I want the mask on each of the TFR images separately (that is, for each channel). I want to show only TFR values significantly different from 0 (and that's why I need to look at baseline-corrected values).
Not sure if `mne.baseline.rescale` could work here.
What if you first `tfr.apply_baseline` and then do not use baselining in plotting? Then you would use the same data in cluster stats and plotting.
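In case the `'percent'` mode is unclear: as far as I know it subtracts the mean baseline power and divides by it. A quick numpy sketch with made-up arrays:

```python
import numpy as np

times = np.arange(-10, 21) / 10.0        # -1.0 ... 2.0 s, 0.1 s steps
power = np.full((4, times.size), 2.0)    # (n_freqs, n_times) toy power
power[:, times > 0] = 3.0                # toy "activation" after t = 0

# 'percent' mode: subtract the mean baseline power, then divide by it
base = power[:, (times >= -1.0) & (times <= 0.0)].mean(axis=1, keepdims=True)
erds = (power - base) / base             # relative change vs. baseline
```

With these toy numbers the baseline period maps to 0 and the activation period to 0.5 (a 50% power increase), which is the ERDS convention.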
Thanks @mmagnuski, this seems to work and is going in the right direction, but the maps still don't look as they should. Obviously, the clusters found by the algorithm heavily depend on the threshold. I've tried a couple of values, but none seems to produce what I want (keep clusters of strongest activation). Here's an example with `threshold=1.5` (and `seed=5`):
If you look at the unmasked images (above, almost unmasked), this is not quite what I'd expect. Is there anything else I can tune to make this work better? Is TFCE an option? If so, how do I set the `threshold` arg for TFCE?
Finally, could it be problematic that ERD values (negative values) are bounded by -1, whereas ERS values (positive values) are unbounded? Would the presence of large ERS values confound the detection of significant ERD clusters? If so, is there a way around that?
Positive and negative clusters are clustered separately (and have separate null distributions) but the bound may make a difference - see for example this paper:
Highlights
- Percentage baseline correction leads to an ERD underestimation and ERS overestimation.
- Subtraction baseline correction does not introduce any bias in ERD/ERS estimation.
I've never actually used TFCE in practice. I always use a threshold >= 2 for t-test values and never fiddle with it, but keep it fixed. Back in my FieldTrip days I also used `minnbchan == 2` for 64-channel caps. In your case you would need 3D clustering, I think, so that adjacent channels would get clustered together. This is not currently implemented in MNE for TFR.
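To illustrate just the cluster-forming step (this is a toy numpy/scipy version, not MNE's implementation): a fixed threshold carves the stat map into connected supra-threshold components, and the permutation test then ranks those clusters.

```python
import numpy as np
from scipy import ndimage

t_map = np.zeros((6, 8))         # toy (n_freqs, n_times) map of t-values
t_map[1:3, 1:4] = 3.0            # one supra-threshold blob
t_map[4:6, 5:8] = 2.5            # another blob, not touching the first

# threshold at t = 2, then label connected components (4-connectivity)
labels, n_clusters = ndimage.label(t_map > 2.0)
sizes = ndimage.sum(t_map > 2.0, labels, index=range(1, n_clusters + 1))
```

Shifting the threshold changes which pixels survive and hence how blobs merge or split, which is why the resulting clusters are so threshold-sensitive.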
> Positive and negative clusters are clustered separately (and have separate null distributions)
OK, good.
> but the bound may make a difference - see for example this paper:
Thanks for the link, this might be worth reading. From a quick first scan I don't think that the percentage baseline correction is a problem in my case, because I'm not using single-trial baselines. Or at least I hope that's not what `apply_baseline(mode='percent')` does - I thought it would use the average of the baseline power across all epochs in the calculation. Can you confirm that this is indeed the case?
> I've never actually used TFCE in practice.
I wonder if anyone has actually ever used it with EEG/MEG data 😄.
> I always use a threshold >= 2 for t-test values and never fiddle with it, but keep it fixed. Back in my FieldTrip days I also used `minnbchan == 2` for 64-channel caps.
Yeah, it would be nice to keep a fixed value. I can't set the minimum number of channels in MNE though.
> In your case you would need 3D clustering, I think, so that adjacent channels would get clustered together. This is not currently implemented in MNE for TFR.
No, I'm performing the clustering separately for each channel, so 2D (f x t) is fine.
Also, any idea how I could make the non-significant values transparent (e.g. with an alpha around 0.2) instead of just white? I really like how you and @jona-sassenhagen visualize this. Currently, I use a masked array, and `pcolormesh` seems to map masked values to white. Is this somehow possible by modifying the existing figure/axes? Or do I have to modify the function and call `pcolormesh` twice as in #5010?
`imshow` takes an (x, y, R, G, B, A) array IIRC, so you could set it all manually.
The short route would be helping me get my PR merged and then extending it to TFR :)
> `imshow` takes an (x, y, R, G, B, A) array IIRC, so you could set it all manually
You're right, but for whatever reason `AverageTFR.plot` uses `pcolormesh`, which doesn't support RGBA.
> The short route would be helping me get my PR merged and then extending it to TFR :)
Sounds good, let me know what I can do!
Just review it I guess.
> Positive and negative clusters are clustered separately (and have separate null distributions)
IIRC they do not have separate null distributions. If you want this behavior you should do two clustering steps, one with `tail=1` and the other with `tail=-1`.
> You're right, but for whatever reason `AverageTFR.plot` uses `pcolormesh`, which doesn't support RGBA.
`imshow` is generally less flexible than `pcolormesh`. For example, `imshow` must have uniform "pixel" sizes, which isn't good when frequencies are not guaranteed to be evenly linearly spaced.
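A toy illustration of that flexibility (all names made up): `pcolormesh` accepts explicit bin edges, so log-spaced frequencies get rows of varying height, which `imshow` cannot represent.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

times = np.linspace(-0.5, 1.5, 21)                  # bin centers (s)
freqs = np.logspace(np.log10(4), np.log10(40), 10)  # log-spaced centers (Hz)
power = np.zeros((len(freqs), len(times)))          # toy TFR values

# Build n+1 edges from n centers: geometric midpoints for frequency,
# arithmetic midpoints for time, extended symmetrically at both ends.
f_mid = np.sqrt(freqs[:-1] * freqs[1:])
f_edges = np.concatenate([[freqs[0] ** 2 / f_mid[0]], f_mid,
                          [freqs[-1] ** 2 / f_mid[-1]]])
t_mid = (times[:-1] + times[1:]) / 2
t_edges = np.concatenate([[2 * times[0] - t_mid[0]], t_mid,
                          [2 * times[-1] - t_mid[-1]]])

fig, ax = plt.subplots()
ax.pcolormesh(t_edges, f_edges, power)  # rows get unevenly tall cells
```

With geometric midpoints the row heights grow with frequency, so each cell still brackets its center.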
> I've tried a couple of values, but none seems to produce what I want (keep clusters of strongest activation)
If you take a look at a thresholded version of the statistical map (not the grand average) produced by whatever `stat_fun` you are using, you should see that whatever the cluster test returns are the "biggest" clusters. This will not necessarily match the "biggest" grand-average peaks, depending on the variability in your data (because of the division by the standard deviation in the t-test).
> IIRC they do not have separate null distributions. If you want this behavior you should do two clustering steps, one with `tail=1` and the other with `tail=-1`.
I'll try that.
> `imshow` is generally less flexible than `pcolormesh`. For example, `imshow` must have uniform "pixel" sizes, which isn't good when frequencies are not guaranteed to be evenly linearly spaced.
Yeah, this makes sense especially for wavelet-type frequency estimation, but I thought that MNE always uses linearly spaced frequency values. Anyway, I already assumed that `pcolormesh` is there for a reason :smile:.
> If you take a look at a thresholded version of the statistical map (not the grand average) produced by whatever `stat_fun` you are using, you should see that whatever the cluster test returns are the "biggest" clusters. This will not necessarily match the "biggest" grand-average peaks, depending on the variability in your data (because of the division by the standard deviation in the t-test).
Right. It's not actually that bad: the significant clusters are approximately the largest activations in the grand-average image. I just expected to see more clusters. Maybe this will improve when I handle positive and negative clusters separately as you suggest.
> Maybe this will improve when I handle positive and negative clusters separately as you suggest.
Also, you can use e.g. `step_down_p=0.05`. It removes "significant clusters" (don't ever say this in a publication) that could be pushing the null distribution larger than it should be.
(...And by that I mean don't use the phrase "significant clusters", even though that's how people tend to colloquially refer to them. The `step_down_p` stuff is fine for publication.)
@cbrnr
> You're right, but for whatever reason `AverageTFR.plot` uses `pcolormesh`, which doesn't support RGBA.
@larsoner
> `imshow` is generally less flexible than `pcolormesh`. For example, `imshow` must have uniform "pixel" sizes, which isn't good when frequencies are not guaranteed to be evenly linearly spaced.
I've been thinking about this from time to time since we decided to use `pcolormesh`. Due to some inconvenient aspects of `pcolormesh` (as @cbrnr points out), I am not using it in my plots; I always use `imshow`. It might be easier if we eliminated all the inference magic from TFR plotting (linear vs. log spacing) and always used linear `imshow` by default, using `pcolormesh` only if the user asks for it. Currently, even if I have log-spaced freqs, I still use `imshow` and just relabel the y ticks accordingly.
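Roughly like this (a hypothetical toy sketch, not my actual plotting code): draw the matrix on an index grid and move the y tick labels to the real frequencies.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

freqs = np.logspace(np.log10(4), np.log10(40), 8)  # log-spaced (Hz)
power = np.zeros((len(freqs), 50))                 # toy TFR values

fig, ax = plt.subplots()
ax.imshow(power, aspect="auto", origin="lower")    # uniform pixel rows
ax.set_yticks(np.arange(len(freqs)))               # one tick per row...
ax.set_yticklabels([f"{f:.1f}" for f in freqs])    # ...labelled with real freqs
```

The pixels stay uniform, so the axis is visually log-spaced even though the image itself is drawn linearly.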
Even if I have log-spaced frequencies, I might want to view them on a linear scale (or have linearly spaced freqs viewed on a log scale). This is not possible with `imshow`, right? So we would be restricted to always showing log-spaced frequencies on a log scale and linearly spaced frequencies on a linear scale? (And either never showing arbitrarily spaced frequencies, or doing so incorrectly?)
Unfortunately you are correct, I fear. @cbrnr - can you use the approach taken by Jona in his transparency PR with `pcolormesh`? If so, then there would be much less reason to drop `pcolormesh` as the default (and the flexibility that you mention definitely counts).
Using two separate clusterings with `tail=-1` and `tail=1` as well as `step_down_p=0.05` gives decent results (but it would be good to check with more data):
@mmagnuski I'll try to do the masking with @jona-sassenhagen's approach.
I added `mask` and `alpha` arguments. This works OK, but the resulting image now contains white bands (nicely visible especially in the color bar):
Or maybe it's only the color bar - why is it transparent anyway?
Colorbar problem solved: I took the wrong axes object (because now there are two, the transparent and the opaque one).
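For the record, here is a minimal toy sketch of the two-layer pattern as I understand it (made-up data, not the actual PR code): the faint full map underneath, the opaque significant values on top, and the colorbar taken from the opaque mesh.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

data = np.linspace(-2.0, 2.0, 5 * 7).reshape(5, 7)  # toy TFR values
sig = np.abs(data) > 1.0                            # toy significance mask

fig, ax = plt.subplots()
vmin, vmax = data.min(), data.max()                 # shared scale for both layers
ax.pcolormesh(data, vmin=vmin, vmax=vmax, alpha=0.2)   # everything, faint
top = ax.pcolormesh(np.ma.masked_where(~sig, data),    # significant cells, opaque
                    vmin=vmin, vmax=vmax)
fig.colorbar(top, ax=ax)                            # colorbar from the OPAQUE mesh
```

Sharing `vmin`/`vmax` between the two meshes keeps the colors of the two layers aligned; building the colorbar from the transparent mesh is what produces the washed-out bands.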
All changes are summarized in #5036.
@cbrnr Glad to hear it works well!
I'm trying to plot a TF image (more specifically an ERDS map) with a significance mask. In other words, I would like to plot only significant time-frequency values, whereas non-significant values should be white.
Here's the example code I'm working with: https://martinos.org/mne/stable/auto_examples/time_frequency/plot_time_frequency_erds.html
I need the individual epochs to compute the significance, so I add the argument `average=False` to `tfr_multitaper`. I create a new object containing the average TFR with `average_tfr = power.average()`. I renamed `power` to `tfr` for consistency. Here is the modified code starting on line 103 (everything else above doesn't change except for two additional imports):

This code is inspired by https://martinos.org/mne/stable/auto_tutorials/plot_stats_cluster_1samp_test_time_frequency.html. However, my modified example doesn't look right. The masked regions do not fit the TFR at all; it seems like the non-significant patches are arbitrarily placed on the TFR image (I've used `threshold=5` in the cluster permutation test):

Any idea what I'm missing here? Could this be related to the fact that I'm plotting the TFR image with `baseline=[-1, 0], mode="percent"`, which might not be reflected in `average_tfr.data`? If so, how can I run a significance test on the correct data? Or is there something wrong with how I'm performing the cluster permutation test?