Closed FabiolaMestanza closed 2 years ago
Both of your questions are important, complex, and difficult to answer. I have provided preliminary responses below. Since these questions go well beyond specific spm1d procedures, I'd suggest seeking additional support from a statistician and/or from general statistics forums.
Question 1: Which kind of approach would you suggest us to run this analysis? Is there a test more suitable for us in the spm1D package?
This is a difficult question to answer because the appropriate approach for single patient vs. healthy population comparisons depends largely on the precise purpose, the experimental design, and the nature of the healthy population data. The most straightforward case is a large sample of healthy individuals (N>100 or N>1000), in which case the healthy mean can be treated as effectively fixed and used as the datum for a one-sample test. A more complex case is a small, random sample of healthy individuals (N<100), in which case a two-sample test may be more appropriate, but where sample size and power considerations can become complex.
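As a rough illustration of the large-sample case, the sketch below computes a pointwise one-sample t statistic for one patient's trials against a fixed datum. The data are synthetic and the names (`patient`, `healthy_mean`, `one_sample_t`) are hypothetical. It uses NumPy only and performs no inference; a one-sample test in spm1d would add the inference step on top of this statistic.

```python
import numpy as np

rng = np.random.default_rng(0)

Q = 101  # number of time nodes (assumed: 0-100% of the gait cycle)
J = 4    # number of trials for the single patient

# Hypothetical healthy mean waveform, treated as a fixed datum
time = np.linspace(0.0, 1.0, Q)
healthy_mean = 30.0 * np.sin(2.0 * np.pi * time)

# Synthetic patient trials: healthy mean + constant offset + noise
patient = healthy_mean + 5.0 + rng.normal(0.0, 2.0, size=(J, Q))


def one_sample_t(y, datum):
    """Pointwise one-sample t statistic for a (J, Q) array against a (Q,) datum."""
    n = y.shape[0]
    mean = y.mean(axis=0)
    sd = y.std(axis=0, ddof=1)  # sample standard deviation at each node
    return (mean - datum) / (sd / np.sqrt(n))


t = one_sample_t(patient, healthy_mean)
print(t.shape)
```

With only four trials the standard deviation estimate at each node is noisy, so the statistic itself will be noisy; this is one reason the small-sample case needs more care.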
This question is highly complex and also quite general, pertaining to all types of data including simple scalar data. I'd therefore recommend posting it to Cross Validated (the Stack Exchange statistics forum), framed in terms of simple scalar data like body mass.
Question 2: ...it would be useful to summarize the results of the SPM test with a zero-dimensional variable; so that each patient will have a single summary value of gait performance. Would this be methodologically possible?
It is possible, but I advise against using a test statistic (t value or T2 value) because these values are sample-size dependent; they generally increase as sample size increases, even when the true difference is constant. One option is to use the p-value associated with the maximum test statistic value, which is demonstrated here. However, since those p-values are generally not valid when large (p>0.5), it might be a better idea to use a more common difference metric like the RMSE between two sample means. There are several other difference / similarity metric options, like mutual information. There are also several software packages that directly support difference / similarity metric calculations, including scikit-image and similarity_measures.
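To make the sample-size point concrete, here is a minimal sketch on synthetic data (NumPy only, not spm1d). It holds the true between-group difference constant and compares the maximum absolute pointwise two-sample t value with the RMSE between the two sample means at two sample sizes. The function names and the chosen difference and noise levels are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

Q = 101          # number of time nodes
TRUE_DIFF = 2.0  # constant true difference between group means (assumed units: degrees)


def rmse_between_means(ya, yb):
    """RMSE between the mean waveforms of two (J, Q) samples."""
    diff = ya.mean(axis=0) - yb.mean(axis=0)
    return float(np.sqrt(np.mean(diff ** 2)))


def max_abs_t(ya, yb):
    """Maximum absolute pointwise two-sample t statistic (unequal variances)."""
    na, nb = ya.shape[0], yb.shape[0]
    se = np.sqrt(ya.var(axis=0, ddof=1) / na + yb.var(axis=0, ddof=1) / nb)
    t = (ya.mean(axis=0) - yb.mean(axis=0)) / se
    return float(np.max(np.abs(t)))


results = {}
for n in (10, 1000):
    ya = rng.normal(TRUE_DIFF, 1.0, size=(n, Q))  # group A: shifted by TRUE_DIFF
    yb = rng.normal(0.0, 1.0, size=(n, Q))        # group B: reference
    results[n] = (rmse_between_means(ya, yb), max_abs_t(ya, yb))
    print(f"n={n:5d}  RMSE={results[n][0]:.2f}  max|t|={results[n][1]:.1f}")
```

In this setup the RMSE stays close to the true difference at both sample sizes, while the maximum t value grows roughly with the square root of the sample size, which is why a test statistic is a poor per-patient summary score.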
Hi all,
Just saw this and was intrigued. I think techniques like the Gait Deviation Index (https://pubmed.ncbi.nlm.nih.gov/18565753/), Dynamic Motor Control Index (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4683117/), and other similar techniques are potentially useful in this instance.
Regards, Bernard
Thank you for the suggestions! They are not directly relevant to current spm1d procedures, so I'm closing this issue. If you meant these as feature requests, please post them to #45. Thanks!
Dear Prof. Pataky,
We are a research group at the University of Milan, and we are applying Statistical Parametric Mapping to gait kinematic data of people with multiple sclerosis (MS). Our aim is to quantify how much the kinematics of each patient differ from those of healthy controls. SPM seems a good candidate, but we have some methodological questions that we would like to address with you.
Question 1: Using a gait analysis system, we have assessed hip, knee, and ankle kinematics of people with MS. We have at least four trials for each patient, and reference kinematic data from healthy subjects matched for gait speed. We would now like to compare each person with MS to the reference kinematic values, both to assess the difference from the physiological pattern and to provide a summary score (for each person with MS) that discriminates between more and less impaired patients. We are using the spm1d package. Given that we are comparing each patient’s angular waveforms of hip, knee, and ankle flexion with the mean waveforms of healthy subjects, which kind of approach would you suggest for this analysis? Is there a test more suitable for us in the spm1d package?
Question 2: To discriminate between more and less impaired people with MS, it would be useful to summarize the results of the SPM test with a zero-dimensional variable, so that each patient has a single summary value of gait performance. Would this be methodologically possible? Intuitively, if a supra-threshold cluster is present, lower p-values would indicate greater differences between a given patient and controls. However, spm1d does not calculate p-values for non-significant comparisons, although we see that this feature is on the to-do list. Alternatively, could the sum of SPM{T2} scores over the one-dimensional continuum be used as a synthetic measure of dissimilarity between a given patient and healthy reference values? Or would you suggest other possible zero-dimensional variables that we can use for our purpose?
Thank you very much for your understanding and your support in advance.
Best regards, Davide Cattaneo, Fabiola Mestanza, Francesco Luciano
We have attached the data and the analysis script for a typical subject.
HC_KNEE.txt MIGA_ANKLE_R.csv MIGA_HIP_R.csv MIGA_KNEE_R.csv HC_ANKLE.txt HC_HIP.txt
```python
################################
# HOUSEKEEPING
################################
# Clear workspace
%reset -f

####################################
# IMPORT LIBRARIES
####################################
# Import os
import os
# Import numpy and pandas
import numpy as np
import pandas as pd
# Import matplotlib for data visualization
import matplotlib.pyplot as plt
# Import spm1d to perform SPM
import spm1d

#####################################################
# IMPORT AND VIEW DATA FROM PT GROUP
#####################################################
# File name of the Pt Hip waveforms
fnamePH = 'MIGA_HIP_R.csv'
# Load patients hip data as numpy array
PH = np.loadtxt(fnamePH, delimiter=";")
# Plot imported waveforms
plt.plot(PH.T)

# File name of the Pt Knee waveforms
fnamePK = 'MIGA_KNEE_R.csv'
# Load patients knee data as numpy array
PK = np.loadtxt(fnamePK, delimiter=";")
# Plot imported waveforms
plt.plot(PK.T)

# File name of the Pt Ankle waveforms
fnamePA = 'MIGA_ANKLE_R.csv'
# Load patients ankle data as numpy array
PA = np.loadtxt(fnamePA, delimiter=";")
# Plot imported waveforms
plt.plot(PA.T)

#####################################################
# IMPORT AND VIEW DATA FROM CTRL GROUP
#####################################################
# File name of the Ctrl Hip waveforms
fnameCH = 'HC_HIP.txt'
# Load control hip data as numpy array
CH = np.loadtxt(fnameCH, delimiter=";")
# Plot imported waveforms
plt.plot(CH.T)

# File name of the Ctrl Knee waveforms
fnameCK = 'HC_KNEE.txt'
# Load control knee data as numpy array
CK = np.loadtxt(fnameCK, delimiter=";")
# Plot imported waveforms
plt.plot(CK.T)

# File name of the Ctrl Ankle waveforms
fnameCA = 'HC_ANKLE.txt'
# Load control ankle data as numpy array
CA = np.loadtxt(fnameCA, delimiter=";")
# Plot imported waveforms
plt.plot(CA.T)

#################################################
# PREPARE ARRAYS FOR HOTELLING'S TEST
#################################################
# J = number of responses
# Q = number of nodes to which the 1D responses have been resampled
# I = number of vector components
# See: https://spm1d.org/doc/Stats1D/multivariate.html
arr0 = np.dstack((CH, CK, CA))
arr1 = np.dstack((PH, PK, PA))
# Check that dimensions of arr0 and arr1 are correct (J Q I)
print(arr0.shape)
print(arr1.shape)

#################################################
# PERFORM HOTELLING'S TEST
#################################################
T2 = spm1d.stats.hotellings(arr1, arr0)
T2i = T2.inference(0.05)
print(T2i)
T2i.plot()
z_Test = T2i.z
print(z_Test)
print(T2i.clusters)

#################################################
# SUM OF T2
#################################################
print(np.shape(z_Test))
print("Sum:")
print(np.sum(z_Test))
```
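As a quick check of the array preparation step in the script above, the following sketch shows how `np.dstack` turns three (J, Q) arrays into one (J, Q, I) array. It uses small placeholder arrays rather than the attached data, and the names `hip`, `knee`, and `ankle` are hypothetical.

```python
import numpy as np

J, Q = 4, 101  # assumed: 4 trials, 101 time nodes

# Placeholder (J, Q) arrays standing in for the hip, knee and ankle waveforms
hip = np.zeros((J, Q))
knee = np.ones((J, Q))
ankle = np.full((J, Q), 2.0)

# Stack along a new third axis: result is (J, Q, I) with I = 3 components
arr = np.dstack((hip, knee, ankle))
print(arr.shape)  # (4, 101, 3)

# Component i is recovered by indexing the last axis
assert np.array_equal(arr[:, :, 1], knee)

# All inputs must share the same J and Q; otherwise np.dstack raises ValueError
try:
    np.dstack((hip, np.ones((J + 1, Q))))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```

This means each joint must contribute the same number of trials, resampled to the same number of nodes, before the arrays can be stacked.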