0todd0000 / spm1dmatlab

One-Dimensional Statistical Parametric Mapping in Matlab.
GNU General Public License v3.0

zstar value not changing between comparisons #42

Closed gillianweir closed 7 years ago

gillianweir commented 7 years ago

Hi Todd,

I am running some paired t-tests on some data, but each comparison I run (n=3) produces exactly the same zstar value. When I run the example code it works fine. I just looked at the spm1d.rft1d.t.m and spm1d.rft1d.RFTCalculator.m code and noted that these functions operate on 101 nodes, whereas the data I have are normalised to 100 points. Would this be causing the issue? Or are you familiar with what would cause this?

Many thanks in advance.

Gillian

0todd0000 commented 7 years ago

Hi Gillian,

A non-changing critical threshold can sometimes be produced when the data are quite rough. If your data are not rough (FWHM > 5) then please ignore the rest of this reply and send some additional details.

First, the somewhat easier bit: spm1d.rft1d.RFTCalculator.m and all other probability calculations do indeed set the default number of nodes to 101, but if your data have a different number of nodes it should be recognized automatically. For example:

>> y = randn(8,101);
>> spm = spm1d.stats.ttest(y);
>> disp( spm )

SPM{t}
         z: [1×101 double]
        df: [1 7]
      fwhm: 2.0770
    resels: [1 48.1472]

As long as the test statistic "z" has the correct size (in this case: "1 x 101"), then the probability calculations should be fine. Changing the array size to 851 nodes, for example, should be reflected in the test statistic as indicated below.

>> y = randn(8,851);
>> spm = spm1d.stats.ttest(y);
>> disp( spm )

SPM{t}
         z: [1×851 double]
        df: [1 7]
      fwhm: 2.1695
    resels: [1 391.7956]

Notice that the test statistic size has changed.
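If you want to double-check this with your own data, a quick sanity check on the size of the test statistic is enough (a minimal sketch; "y" stands for your own observations-by-nodes data array, with Q the number of nodes):

>> Q   = size(y, 2);              %number of nodes in your data (e.g. 100)
>> spm = spm1d.stats.ttest(y);    %any spm1d test would do; only the size of spm.z matters here
>> assert( numel(spm.z) == Q )    %the test statistic should span exactly Q nodes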

{I'll start another reply below regarding roughness and how it relates to your main question...}

0todd0000 commented 7 years ago

{Now on to the main question...}

If the data are quite rough (FWHM < 5) the critical thresholds might be identical even when using different datasets for statistical tests. The reason is that spm1d by default uses a Bonferroni correction when the random field theory (RFT) threshold becomes too conservative.

To see what I mean it's sufficient to consider only the function which computes the critical threshold: spm1d.rft1d.t.isf. This is the RFT inverse survival function ("isf") for the 1D t statistic, and here's how it works:

>> a      = 0.05;   %alpha (Type I error rate)
>> df     = 9;      %degrees of freedom
>> nNodes = 101;    %number of continuum nodes
>> fwhm   = 10;     %smoothness (full-width at half-maximum of a Gaussian smoothing kernel)
>> tstar  = spm1d.rft1d.t.isf(a, df, nNodes, fwhm);   %critical threshold
>> disp( tstar )
    3.9168

The rougher the data are, the smaller the FWHM value becomes, and the higher the threshold becomes:

>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 8) )
    4.0849
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 6) )
    4.3069
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 4) )
    4.6296

The threshold is higher because rougher random data are more likely to reach an arbitrarily high value. This smoothness-dependent threshold underlies all spm1d tests and indeed, in a nutshell, summarizes the entire SPM methodology.

While the RFT thresholds above have been shown to be accurate for 1D data (Pataky, 2016), it is also known that RFT thresholds become too conservative when the data become very rough (Friston et al., 2007). To deal with this problem spm1d, like other SPM software packages, defaults to the Bonferroni threshold when it is smaller than the RFT threshold. In spm1d this can be done simply by adding the optional argument "withBonf" ("with Bonferroni correction") to the above calculations:

>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 8, 'withBonf',true) )
    4.0849
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 6, 'withBonf',true) )
    4.3069
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 4, 'withBonf',true) )
    4.6296

You should see that there is no change from the results above, because the data are still suitably smooth (i.e. the FWHM is suitably large). However, when the FWHM value becomes quite small you should see that the critical threshold plateaus as follows:

>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 4.0, 'withBonf',true) )
    4.6296
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 3.5, 'withBonf',true) )
    4.7384
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 3.0, 'withBonf',true) )
    4.7880
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 2.5, 'withBonf',true) )
    4.7880
>> disp( spm1d.rft1d.t.isf(a, df, nNodes, 2.0, 'withBonf',true) )
    4.7880

In other words, when the data become too rough they are statistically indistinguishable from nNodes independent processes (at least from the RFT perspective).
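As a sanity check (this uses only MATLAB's built-in tinv, not spm1d, and assumes a one-tailed Bonferroni correction over nNodes tests), the plateau value above should match the ordinary Bonferroni threshold computed directly from the t distribution:

>> tstar_bonf = tinv(1 - a/nNodes, df);   %a = 0.05, df = 9, nNodes = 101 from above
>> disp( tstar_bonf )
    4.7880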

If your data are quite rough (FWHM < 5) this may explain why the threshold is constant.
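If you would like to check the roughness of your own data, the estimated FWHM is stored in the SPM object itself (a minimal sketch, assuming your two conditions are in matrices YA and YB with one row per subject):

>> spm = spm1d.stats.ttest_paired(YA, YB);   %paired t-test on your own data
>> disp( spm.fwhm )                          %estimated smoothness; values below about 5 mean the Bonferroni threshold will likely apply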

Incidentally, the above interpretation echoes signal processing theory, and in particular the Nyquist criterion, which roughly states that data should be sampled at a frequency of at least 2f in order to accurately capture a signal with a frequency of f. The mathematics underlying that theory transfer implicitly into RFT calculations via the FWHM parameter.

If your data have a quite small FWHM value, it could imply that the sampling frequency is inadequate and/or that high-frequency noise has not been suitably suppressed.

Todd

References

Friston, K. J., Ashburner, J. T., Kiebel, S. J., Nichols, T. E., & Penny, W. D. (2007). Statistical Parametric Mapping: The Analysis of Functional Brain Images. London: Elsevier.

Pataky, T. C. (2016). rft1d: Smooth One-Dimensional Random Field Upcrossing Probabilities in Python. Journal of Statistical Software, 71(7), 1–22. http://doi.org/10.18637/jss.v071.i07

gillianweir commented 7 years ago

Hi Todd,

Thank you so much for the detailed insight. To follow up on part 1 of your response: I ran the t-test with nNodes at 100 (attached figure: spmi_100), then interpolated the data to 101 points and got a very slightly higher zstar value (attached figure: spmi_101).

With regards to your final comments, I had a similar thought. What I am attempting to measure is coordination variability of different segment coupling angles, so in that sense we are essentially attempting to measure biological noise, and FWHM values are typically <5 for data that I have looked at. Does this then violate the assumptions of SPM?

Gill

0todd0000 commented 7 years ago

Hi Gill,

Right: the Bonferroni correction changes with the number of nodes. When the data are rough spm1d reverts to the Bonferroni correction. Thus changing the number of nodes changes the threshold.

By contrast, when the data are smooth, spm1d uses the RFT correction, which is invariant to the number of nodes provided the sampling frequency is adequately high. You could use 100, 1,000 or 1,000,000 nodes and the RFT correction would be roughly constant. In other words, the Bonferroni correction is node-dependent and smoothness-independent, whereas the RFT correction is smoothness-dependent and node-independent.
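To make that concrete, here is a small sketch using the same isf function as above for the RFT side and MATLAB's built-in tinv for the Bonferroni side; it assumes that resampling the same continuum to more nodes scales the FWHM (in nodes) proportionally:

>> a  = 0.05;   df = 9;
>> %RFT: FWHM scales with the number of nodes, so the resel count (roughly
>> %nNodes/FWHM) and hence the threshold stay roughly constant:
>> disp( spm1d.rft1d.t.isf(a, df, 101,  10) )
>> disp( spm1d.rft1d.t.isf(a, df, 1010, 100) )
>> %Bonferroni: the correction depends only on the number of nodes, so the
>> %threshold keeps growing as nodes are added:
>> disp( tinv(1 - a/101,  df) )
>> disp( tinv(1 - a/1010, df) )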

"FWHM values are typically <5 for data that I have looked at. Does this then violate the assumptions of SPM?"

No, it doesn't violate SPM's assumptions. It just shifts inference from RFT procedures to Bonferroni procedures. SPM's assumptions are precisely the same as those of the usual classical tests for scalar data: normally distributed and independent residuals, random sampling, etc. SPM makes no assumptions about smoothness.

Roughness (FWHM < 5) instead suggests that sampling theory fundamentals are not being followed. The signal needs to be sampled with a suitably high frequency. If sampling frequency is too low then the signal probably shouldn't be analyzed at all (using SPM or any other method).

Todd