How to look at clustering in time, frequency and space

pokor076 commented 4 years ago

Hi there, I'm enjoying playing around with the toolkit and love the idea of the TFCE approach. Is it possible to perform a t-test that identifies significant clusters in time, frequency and space? I have tried running two matrices (condition1 and condition2), each composed of 128 electrodes x 129 frequency bins x 513 time bins, through the ept_TFCE function (I changed the ft_flag to 1). The function ran, but the output didn't have any spatial information and I couldn't get the results to load (I know I selected a valid e_loc file when running the t-test because I was able to get spatial information when reading in subjxtimexelec matrices). Essentially, I would like to contrast two grand average time-frequency matrices to identify spatio-temporo-frequency clusters of interest. Is this possible using this toolkit, and if so, do you have any tips for where I might be going astray?

Thanks! Victor

Mensen commented 4 years ago

In principle, you shouldn't need to change anything and your matrix will just have the other dimension in it.

The ft_flag is for cases where you don't have any channel information at all.

pokor076 commented 4 years ago

Ahhh that makes sense. Thanks!

pokor076 commented 4 years ago

Ok so I was able to get everything to run smoothly (I believe). The result viewer is super handy and intuitive! Now my issue is that I'm not seeing any compelling clusters in space, frequency or time (screenshot attached). I realize that it's possible that there are just no strong clusters that modulate between my conditions, but I wanted to check with you to ask: 1.) is this a fairly common result of TFCE on TF EEG data? 2.) do you have any recommendations for approaches that better leverage the power of TFCE for TF EEG data (in contrast to the approach I described in my first post)?

Thanks so much for this tool! My lab produces a lot of TF EEG data and a perennial issue we face is choosing electrodes, time points and frequency bins of interest for exploratory analyses.

Mensen commented 4 years ago

Hmmm... I wouldn't say that is a typical result actually. Although I don't do a lot of channel x time x frequency analysis myself... so it's difficult for me to comment on "typical".

I'm glad you think the result viewer is still handy... I don't think I've touched the code for almost 5 years at this point and would probably do a lot of things differently (if I could ever find the time).

I'd go as far as to say there may be some mistake in the analysis... however, you do get "some" results.

If you plot the individual TF maps for a single channel of interest, does it seem like there should be a difference there?

If you then look at the "Results" variable in the file, and find the T_obs (observed T-value) for a few specific points of interest, are those T-values very high?

While the TFCE method does "generally" reduce the significance compared to analysing any one single selected point (given the multiple comparisons issue)... this is not always the case. If those points have strong support from their neighbours (in channel, time, and frequency), it could in fact increase significance (lower p-value) compared with the single-point analysis.
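A minimal sketch of the neighbour-support idea, written in Python rather than the toolkit's MATLAB and not taken from the toolkit's code. It assumes a 1-D sequence of non-negative statistics where neighbours are simply adjacent indices. The function name `tfce_1d`, the step size `dh` and the exponents `E` and `H` are placeholders chosen for illustration.

```python
def tfce_1d(stat, dh=0.1, E=0.5, H=2.0):
    """Approximate TFCE scores for a 1-D list of non-negative statistics.

    For each threshold h, every point inside a supra-threshold run gains
    extent**E * h**H * dh, where extent is the length of that run.
    """
    n = len(stat)
    scores = [0.0] * n
    n_steps = int(max(stat) / dh)
    for k in range(1, n_steps + 1):
        h = k * dh
        i = 0
        while i < n:
            if stat[i] < h:
                i += 1
                continue
            # Find the end of the contiguous run at or above this threshold.
            j = i
            while j < n and stat[j] >= h:
                j += 1
            weight = ((j - i) ** E) * (h ** H) * dh
            for m in range(i, j):
                scores[m] += weight
            i = j
    return scores


# Two peaks of identical height: one isolated, one flanked by high neighbours.
stat = [0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 2.5, 3.0, 2.5, 0.0]
scores = tfce_1d(stat)
isolated, supported = scores[2], scores[7]
print("isolated peak:", isolated, "supported peak:", supported)
```

Both peaks have the same raw value, but the one with high neighbours belongs to a wider run at most thresholds and so accumulates a larger score. In channel x time x frequency data the same logic would apply with neighbourhoods defined across all three dimensions.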

Let's first make sure there are no bugs somewhere in that code, and then you can be more confident in those results (or lack thereof).

pokor076 commented 4 years ago

Hey thanks for the response! Plotting TF maps for a single channel of interest definitely makes it seem like there are condition differences (example subtraction surfaces attached). Similarly, when I plot the Results.Obs (which I'm guessing is the raw t values variable) for a single channel, the surface looks fairly believable (attached). However, the p values corresponding to those believable raw t values are all 1... This makes me think that there are two possible explanations: 1. my data lacks spatial, frequency and/or time clusters of difference between conditions 2. there is an issue with how the cluster enhancement code is dealing with the three dimensions. Let me know what you think. Like I said I'm super motivated to try to get the details of this method ironed out because it could be a real game changer!

Mensen commented 4 years ago

Could you restrict the analysis to a single time point, or frequency bin, or even channel of particular interest and see if the code still generates only 1's for the p-value?

pokor076 commented 4 years ago

Yep so I tried another t-test contrasting two conditions for 128 elecs x 1 frequency bin x 513 time bins and again the original t values seem reasonable, but the p values are dominated by 1s again (histogram attached).

This may or may not be related, but I've noticed that if I choose to run an independent t-test the code takes about 2-4 min to run, but when I choose to run the dependent t-test on the same data the code takes around 8 hours to run. Does that seem right? The independent t-test seems to be coming up with more significant p-values as well (I would've thought the dependent t-test would be more sensitive). I tried running an independent t-test on my original grand averages (elecsxfreqxtime), but MATLAB gives me the attached error.
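On the expectation that a dependent test should be more sensitive, here is a small sketch with made-up numbers (not data from this thread), in Python rather than MATLAB. It compares a paired t statistic with a pooled-variance independent t statistic at a single point, before any cluster enhancement. The helper names `paired_t` and `independent_t` are illustrative and are not toolkit functions.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(a, b):
    """Paired (dependent) t statistic on the within-subject differences b - a."""
    d = [y - x for x, y in zip(a, b)]
    return mean(d) / sqrt(variance(d) / len(d))


def independent_t(a, b):
    """Independent-samples t statistic (b minus a) with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical values for five subjects: large differences between subjects,
# but a small and consistent increase from condition 1 to condition 2.
cond1 = [10.0, 12.0, 15.0, 20.0, 25.0]
cond2 = [11.0, 13.5, 16.0, 21.5, 26.0]

t_dep = paired_t(cond1, cond2)
t_ind = independent_t(cond1, cond2)
print("paired t:", t_dep, "independent t:", t_ind)
```

When the same subjects contribute to both conditions and the within-subject change is consistent, the paired statistic is much larger because between-subject variability drops out of the denominator. If the dependent test comes out less sensitive on genuinely paired data, it may be worth checking that subject order matches across the two condition matrices.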

Thanks again!

Untitled document.pdf

pokor076 commented 4 years ago

Hi just wanted to check in about this issue. Would it be helpful if I DM'd you some example files?

Mensen commented 4 years ago

Sorry for the late/no responses. Unfortunately I'm just not sure what could be going on here. This project (and even research field) is no longer related to my daily work, so it's very difficult to find the time to go into much detail here.

It's certainly not normal to have such a long run-time for the dependent tests... the calculation itself is actually faster to complete. However, the majority of the time is spent finding the clusters of results at all the different thresholds set (the approximation to "threshold-free"), so if there are a lot of different clusters in the data, this can take more time. Even so, this should still not take any longer than a few minutes, so with a run-time of 8 hours something is certainly not right.

If you're willing to share some anonymised summary files with me, perhaps I can have a look under the hood at each stage of the calculation and give you some more info. You can email me, or share with research.mensen@gmail.com

Mensen / ept_TFCE-matlab

How to look at clustering in time, frequency and space #28