0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

Non-parametric SPM with non-linear registration #244

Closed IntiVM closed 7 months ago

IntiVM commented 1 year ago

Hi,

I am currently exploring the nonlinear registration SPM from the recently published publication (Simultaneously assessing amplitude and temporal effects in biomechanical trajectories using nonlinear registration and statistical nonparametric mapping).

I have managed to run the analysis (mostly from the example plot "fig_datasetA.py") on my own dataset, but I have a couple of questions to improve my understanding of the interpretation:

  1. 'niter' is set to "5" in all scripts. Is this data-dependent or is this a standardized parameter that should always be set to 5?
  2. Same for 'nperm', which is currently always set to 1000
  3. Using the "fig_datasetX" script, we obtain a very nice visual representation of the time and amplitude effects, but I was wondering if this information (the exact timing of the supra-threshold clusters) is also printed somewhere, similar to the SPM1D scripts?

Thank you for the help!

0todd0000 commented 1 year ago

Hello!

As a general response, sensitivity to all parameters should be checked. If small changes in a parameter induce large changes in the results then the original parameter value is likely a poor one.



  1. 'niter' is set to "5" in all scripts. Is this data-dependent or is this a standardized parameter that should always be set to 5?

niter is the same as the MaxItr parameter that is defined in the srsf_align function in fdasrsf/time_warping.py. It is a numerical convergence parameter. It should probably be increased to a larger value like 10 or 20, then gradually reduced until the results begin to become unstable. For the datasets in this study, niter=5 was deemed sufficient to yield numerical convergence in all datasets.
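The "increase, then reduce until unstable" check can be sketched generically. This is a toy fixed-point iteration standing in for an iterative alignment routine, not the actual srsf_align internals; the point is only the pattern of comparing results across iteration counts:

```python
import math

def run_analysis(niter):
    # Toy stand-in for an iterative routine: a fixed-point iteration
    # whose result converges as niter grows.
    x = 1.0
    for _ in range(niter):
        x = math.cos(x)
    return x

# Run with increasing iteration counts and inspect when the result stabilizes.
results = {n: run_analysis(n) for n in (5, 10, 20, 40)}
for n, r in results.items():
    print(f"niter={n:3d}  result={r:.6f}")

# A value is acceptable once further increases barely change the result.
stable = abs(results[20] - results[40]) < 1e-3
```

The same logic applies to niter in the registration scripts: if halving it leaves the warps and test statistics essentially unchanged, the smaller value is safe.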



  2. Same for 'nperm', which is currently always set to 1000

You should start with 10,000 or 100,000 in general then similarly reduce until the results begin to become unstable. More details about this parameter are available in these threads: #233, #187 , #63



  3. Using the "fig_datasetX" script, we obtain a very nice visual representation of the time and amplitude effects, but I was wondering if this information (the exact timing of the supra-threshold clusters) is also printed somewhere, similar to the SPM1D scripts?

To get the cluster endpoints you need to get the SPM objects, which are currently buried inside the plot_multipanel function. So you'd either need to (a) modify the "fig_datasetX" scripts to actually run the hypothesis tests, or (b) modify the plot_multipanel function so that it returns the SPM objects. Once you have the spm objects as the output of a command like this:

spmi = spm1d.stats.ttest_paired( yA, yB ).inference(0.05)

then you can obtain the endpoints using:

endpoints = [c.endpoints for c in spmi.clusters]

This will yield a list of tuples like this:

[ (11.7, 18.6), (89.1, 100.0)]

which represent the start and end of each cluster.
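For intuition, here is a minimal numpy-only sketch of what those endpoints mean: the (start, end) positions of contiguous supra-threshold runs in the test statistic curve. spm1d computes these for you via spmi.clusters (and additionally interpolates to sub-node precision, hence the fractional values above); this toy version works on whole node indices only:

```python
import numpy as np

def cluster_endpoints(z, zstar):
    # Return (start, end) index pairs of contiguous runs where z > zstar,
    # mimicking [c.endpoints for c in spmi.clusters] at node resolution.
    above  = z > zstar
    edges  = np.diff(above.astype(int))
    starts = list(np.where(edges == 1)[0] + 1)
    ends   = list(np.where(edges == -1)[0])
    if above[0]:
        starts.insert(0, 0)
    if above[-1]:
        ends.append(len(z) - 1)
    return list(zip(starts, ends))

# Toy t-curve over 101 nodes with two supra-threshold regions
z        = np.zeros(101)
z[12:19] = 4.0
z[89:]   = 5.0
print(cluster_endpoints(z, zstar=3.2))  # → [(12, 18), (89, 100)]
```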

IntiVM commented 1 year ago

Dear Todd,

Many thanks for your elaborate reply! Q1 & Q2 are clear after reading the additional issues you have referred me to.

I would like to follow up on Q3. I think I managed to import the object, but I get the following error message: "spm1d.stats.datachecks.SPM1DError: Unequal number of responses in (J x Q) arrays (J1=34, J2=20). J1 must equal J2."

My patient group is indeed bigger than my control group. Does this pose a problem to obtain the SPM object?

Thank you in advance!

0todd0000 commented 1 year ago

The current version of spm1d does not support unbalanced ANOVA designs. Please find a few more details about unbalanced designs, and possible workarounds, in these threads: #195, #182, #159

IntiVM commented 1 year ago

My apologies, I had mistakenly run a paired t-test instead of ttest2, which is obviously impossible with unbalanced groups.

I now realise that I can get the timepoints of the initial t-test by indeed using:

spmi = spm1d.stats.ttest2(y[J:], y[:J]).inference(0.05)
endpoints = [c.endpoints for c in spmi.clusters]
print(endpoints)

And the timepoints of the post-hoc test (amplitude shift) by using:

spmi = spm1d.stats.ttest2(yr[J:], yr[:J]).inference(0.05)
endpoints = [c.endpoints for c in spmi.clusters]
print(endpoints)

From my (currently limited) understanding of the stats in the 'multipanel' part:

ti  = spm1d.stats.ttest2(y[J:], y[:J]).inference(0.05)
T2i = spm1d.stats.hotellings2(Y[J:], Y[:J]).inference(0.05)
tri = spm1d.stats.ttest2(yr[J:], yr[:J]).inference(0.05 / 2)
twi = spm1d.stats.ttest2(d[J:], d[:J]).inference(0.05 / 2)

I thought I could obtain the endpoints for the time shift by using:

spmi = spm1d.stats.ttest2(d[J:], d[:J]).inference(0.05)
endpoints = [c.endpoints for c in spmi.clusters]
print(endpoints)

But this part does not seem correct as I get the error "Zero variance detected at the following nodes: [0, 0, 0, 100]"

My apologies if this is something obvious, my python experience is rather limited. Could you point me to what I am doing wrong here?

Thank you!

0todd0000 commented 1 year ago

The warps are constrained to be zero at the domain endpoints (time=0% and time=100%) so the domain endpoints must be excluded from analyses. Try cropping the endpoints from the warps like this:

dc   = d[:,1:-1]  # cropped
spmi = spm1d.stats.ttest2(dc[J:], dc[:J]).inference(0.05)
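One detail worth noting after cropping: cluster endpoints reported on the cropped arrays are indexed in the cropped domain (node 0 of dc is node 1 of d), so a +1 shift maps them back to the original 0-100 node coordinates. A minimal sketch, using hypothetical endpoint values in place of [c.endpoints for c in spmi.clusters]:

```python
# Hypothetical endpoints from inference on the cropped warps dc = d[:, 1:-1]
endpoints_cropped  = [(10.5, 17.0), (88.0, 98.0)]

# Shift by +1 to express them in the original (uncropped) node coordinates
endpoints_original = [(a + 1, b + 1) for a, b in endpoints_cropped]
print(endpoints_original)  # → [(11.5, 18.0), (89.0, 99.0)]
```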
IntiVM commented 1 year ago

Works perfectly, thanks for the quick response!

IntiVM commented 1 year ago

Hi Todd, Sorry to have to re-open this already, but I haven't quite figured everything out just yet unfortunately.

Since my results are different for the parametric and non-parametric tests, and the data have been checked for outliers, I will most probably report the non-parametric results. I thus wanted to check the endpoints for the non-parametric inference. When I try this:

spmi = spm1d.stats.nonparam.ttest2(y[J:], y[:J]).inference(0.05)
endpoints = [c.endpoints for c in spmi.clusters]
print(endpoints)

I get the error: Warning('The total number of permutations (%d) is very large and may cause computational problems. To enable non-parametric calculations for this many iterations set "force_iterations=True" when calling "inference". NOTE: Setting "force_iterations=True" may require substantial computational resources and may cause crashes. USE WITH CAUTION.')

I have tried to change 'force_iterations' to True in the _spm.py script, but I'm not sure that works because I get the same error after making that modification. I am thus not entirely sure how to successfully obtain the endpoints for the non-parametric inference.

Further, I have another question linked to the non-parametric analysis. I have used the "fig_datasetA.py" script applied to my own data as a basis to plot my results. When I modify "parametric = true" to "parametric = false" I get different results.

Parametric: [figure] Non-parametric: [figure]

I assume this is correct, but the y-axis labels (SPM{t}) are not changed to SnPM{t}. Am I correct in assuming that the second figure indeed shows the non-parametric results, but that the axis labels are simply not updated?

Many thanks, Inti

0todd0000 commented 1 year ago

Parametric and non-parametric results will generally be numerically different, but they are rarely qualitatively different. In the figures you posted there are clearly numerical differences, but the qualitative differences appear to be negligible. This is typical.

Regarding the meaning of permutations / iterations, please see: #123, #91, #63

I am unable to reproduce the error you report. Please try running the following script (a modification of: this script):

import numpy as np
import spm1d

yA         = np.array([ 1.764,  0.400,  0.978,  2.240,  1.867, -0.977, 1.764,  0.400,  0.978,  2.240,  1.867, -0.977])
yB         = np.array([ 0.950, -0.151, -0.103,  0.410,  0.144,  1.454])

tn         = spm1d.stats.nonparam.ttest2(yA, yB)
tni        = tn.inference(0.05, two_tailed=True, force_iterations=True)
print(tni)

Changing force_iterations to False will produce the warning, but setting it to True should suppress the warning. If the error is still generated, please try updating to the most recent version if you have not already: version = 0.4.14 (2022-02-06)