Open hjr1949 opened 7 months ago
You are correct, this is the normal correlation coefficient.
You will also find that the t-statistic values that spm1d yields are the same as those from standard t tests.
SPM calculates test statistic values in exactly the same way that classical methods do. The only real difference between SPM and traditional methods occurs at the inference stage (i.e., when probability values are calculated). SPM uses random field theory (RFT) to calculate the probabilities that smooth Gaussian random fields will produce suprathreshold test statistic continua with various topological features. RFT-based inference is implemented in several open-source packages but is not included in standard commercial software like Excel, MATLAB, SPSS, etc. You can read more about RFT here: https://spm1d.org/rft1d/Theory.html
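To make the equivalence concrete, here is a minimal sketch (synthetic data and variable names are my own, not from spm1d) showing that a pointwise Pearson r, computed node by node with the standard formula, is exactly what a spreadsheet function like Excel's CORREL returns at each node, and how each r converts to the corresponding t statistic:

```python
import numpy as np
from scipy import stats

# Illustrative data (not from spm1d): J subjects, Q time nodes
rng = np.random.default_rng(0)
J, Q = 12, 101
x = rng.normal(0.0, 1.0, J)                    # scalar predictor (e.g. walking speed)
Y = 0.5 * np.outer(x, np.ones(Q)) + rng.normal(0.0, 1.0, (J, Q))  # 1D curves

# Pointwise Pearson r, node by node -- same value Excel's CORREL would give
r = np.array([stats.pearsonr(x, Y[:, q])[0] for q in range(Q)])

# The corresponding t statistic at each node, via the standard conversion
t = r * np.sqrt((J - 2) / (1.0 - r ** 2))
```

The r and t continua themselves match classical pointwise calculations; only the inference step (the critical threshold) differs between SPM and node-by-node testing.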
Thanks for your reply. (1) So, does SPM differ from traditional statistics only in how significance is determined? (2) Are there any easy-to-understand learning materials or videos available?
I'm so sorry for the delay.
(1)So, does SPM only differ from traditional statistics in how significance is determined?
This is a difficult question to answer. The most general way to answer it is that SPM is a methodological framework for generalizing classical statistical inference to the case of scalar and vector fields that operate over n-dimensional manifolds. This does indeed involve probability calculations that differ from those used in classical statistical inference. However, it can be shown mathematically that SPM's probability calculations converge to classical calculations when the n-dimensional manifold collapses to a single point. Therefore, from this perspective, SPM's probability calculations are conceptually identical to classical procedures' probability calculations.
(2) Are there any easy-to-understand learning materials or videos available?
Please consider the spm1d online workshop which covers many SPM fundamentals.
according to your previous answer: "SPM calculates test statistic values in exactly the same way that classical methods do. The only real difference between SPM and traditional methods occurs at the inference stage (i.e., when probability values are calculated)".
When I compare the two methods (after interpolating and time-normalizing the curves, I run traditional significance tests at every discrete point), it seems that a point is significant in SPM only when its p-value from the traditional analysis is around 0.01. Since the t-value calculation is the same, the difference between the two must lie only in how significance is determined. I am therefore confused: which method's conclusions should be primary?
Short answer: SPM's results should be primary because (1) SPM accurately controls $\alpha$ for 1D data, and because (2) pointwise inference does not accurately control $\alpha$ for 1D data.
Longer answer: First please allow me to define the general terms: "0D methods" and "1D methods", where both are shorthand for "nD classical, parametric hypothesis testing methods". With these terms, "0D methods" encompass what you have called "traditional significance tests" above and include standard t-tests, regression, ANOVA all the way through MANCOVA. These 0D methods were developed from around the early 1900s through to the mid-1900s. I use "0D" to indicate that these are point processes. In other words, the measured dependent variable does not vary over some nD domain.
"1D methods", as shorthand for "1D classical, parametric hypothesis testing methods" encompass mainly SPM and FDA, but note that the primary goal of Functional Data Analysis (FDA) is not classical hypothesis testing. Regardless, both SPM and FDA can be regarded as "1D methods". SPM's theoretical core was developed in the 1970s but SPM itself did not appear until the 1990s. FDA as a branch of statistics also appeared in the 1990s.
Regardless of the method, the primary goal of all classical hypothesis testing techniques is to control the Type I error rate at the user-defined rate of $\alpha$ (usually $\alpha$=0.05). Type I error is an incorrect conclusion of "effect" when in fact there is no effect. By extension, $\alpha$=0.05 means that, when there is truly no effect, random sampling will yield Type I error in 5% of an infinite number of identical experiments. The key is that a valid classical hypothesis testing method must control the Type I error rate at $\alpha$.
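The whole-domain consequence can be checked with a small null simulation (parameters are illustrative): pointwise t-tests on 1D data with no true effect declare "significance" somewhere on the domain far more often than 5% of the time, so uncorrected pointwise inference fails to control $\alpha$:

```python
import numpy as np
from scipy import stats

# Null simulation: two identical groups of 1D curves, pointwise t-tests at
# every node, "significant" if p < alpha anywhere on the domain.
rng = np.random.default_rng(1)
Q, n, n_sim, alpha = 101, 10, 500, 0.05

false_positives = 0
for _ in range(n_sim):
    yA = rng.normal(0.0, 1.0, (n, Q))   # no true effect anywhere
    yB = rng.normal(0.0, 1.0, (n, Q))
    p = stats.ttest_ind(yA, yB, axis=0).pvalue
    if (p < alpha).any():               # any node crosses the uncorrected threshold
        false_positives += 1

fwer = false_positives / n_sim          # whole-domain false positive rate (>> 0.05)
```

With 101 uncorrelated nodes the whole-domain false positive rate approaches 1 - 0.95^101, i.e. nearly every null experiment produces at least one "significant" node.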
SPM controls the Type I error rate at $\alpha$ for continuous univariate (scalar) and multivariate (vector, tensor) data operating over arbitrary nD domain geometries. For the case of 1D domains the geometries are limited to: (1) a single line segment and (2) multiple, detached line segments. In most 1D applications the domain is just a single line segment. Regardless of the domain geometry, SPM uses the smoothness of the 1D residuals to determine the critical $\alpha$-based test statistic value (e.g. critical t-value, critical F-value). Consider the figure below, which depicts progressively smoother 1D residuals and is copied from the rft1d documentation.

[Figure: fig_continua1d.png]
The 1D residuals in panel (a) are totally uncorrelated, can be considered "infinitely rough" and have a smoothness parameter of FWHM=0. The 1D residuals in panel (f) are nearly perfectly correlated, approach "infinitely smooth" and have a large smoothness parameter value approaching infinity. The other panels depict intermediate cases between the two extremes of infinite roughness and infinite smoothness. Note that the residuals in most biomechanical datasets fall into the range depicted in panels (b) through (e).
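Residuals like those in the figure's panels can be generated by convolving white Gaussian noise with a Gaussian kernel. This sketch (illustrative FWHM value; standard FWHM-to-sigma conversion) shows how smoothness manifests as autocorrelation between neighboring nodes:

```python
import numpy as np
from scipy import ndimage

# Generate 1D residual-like noise at two smoothness levels by smoothing
# white Gaussian noise with a Gaussian kernel (FWHM value is illustrative).
rng = np.random.default_rng(2)
Q = 101
rough = rng.normal(0.0, 1.0, Q)                    # FWHM ~ 0: panel (a)-like
fwhm = 20.0                                        # intermediate smoothness
sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM-to-sigma conversion
smooth = ndimage.gaussian_filter1d(rough, sigma)
smooth = smooth / smooth.std(ddof=1)               # restore unit variance

def lag1(y):
    """Lag-1 autocorrelation: near 0 for rough noise, near 1 for smooth noise."""
    return np.corrcoef(y[:-1], y[1:])[0, 1]
```

Increasing the FWHM drives the lag-1 autocorrelation toward 1, i.e. toward the nearly perfectly correlated case in panel (f).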
Consider first the two extremes:

- When the residuals are infinitely smooth as in panel (f), the SPM solution converges (both mathematically and numerically) to the classical 0D solution.
- When the residuals are infinitely rough as in panel (a), the SPM solution converges to the classical 0D solution with a Bonferroni correction for multiple comparisons. The Bonferroni correction retains the Type I error rate at $\alpha$ for N uncorrelated tests; note that panel (a) depicts N uncorrelated points.
Next consider the intermediary cases in panels (b) through (e):
When the 1D residuals are neither infinitely rough nor infinitely smooth, the residual smoothness parameter (FWHM) is roughly inversely proportional to the number of independent processes; this is not a mathematically accurate description of the FWHM parameter, but it is nevertheless an effective conceptual aid for interpreting the FWHM parameter and its relation to SPM tests. When FWHM is small there are many independent processes, and when FWHM is large there are few independent processes. SPM uses the estimated FWHM to effectively infer the number of independent processes and thereby derive a Bonferroni-like correction for multiple comparisons. Again, this is not a mathematically accurate description of what SPM does, but it is a reasonable conceptual aid for understanding how SPM works across various levels of 1D residual smoothness. The most important point is that SPM can be regarded most simply as a smoothness-dependent correction for multiple comparisons.
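As a conceptual aid only (this is plain Bonferroni arithmetic driven by the rough heuristic n_eff ≈ Q / FWHM described above, not the actual RFT calculation), the following sketch shows how the critical t-value falls as residual smoothness rises:

```python
from scipy import stats

def bonferroni_critical_t(alpha, df, n_tests):
    """Two-tailed critical t-value after a Bonferroni correction for n_tests tests."""
    return stats.t.ppf(1.0 - alpha / (2.0 * n_tests), df)

# Conceptual aid only: treat n_eff ~ Q / FWHM as the number of independent
# processes; smoother residuals -> fewer effective tests -> lower threshold.
Q, df, alpha = 101, 18, 0.05
thresholds = {fwhm: bonferroni_critical_t(alpha, df, max(Q / fwhm, 1.0))
              for fwhm in (1.0, 5.0, 20.0, 100.0)}
```

At FWHM large enough that n_eff collapses to 1, the threshold reduces to the classical single-test critical t, matching the infinitely-smooth extreme above; the actual SPM/RFT threshold lies between the two extremes and is computed from expected topological features rather than this heuristic.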
The key difference between 0D methods and 1D methods is therefore:

- 0D methods can control $\alpha$ only for the 0D case because they do not consider 1D residual smoothness.
- 1D methods like SPM can control $\alpha$ for the nD case because they consider 1D residual smoothness.
it seems that only when the p-value in traditional statistical analysis is around 0.01, there is a significant difference in the SPM.
If you have set a critical p-value of 0.01 and have used 0D methods to analyze a specific dataset, you may indeed have approximated an SPM solution for that case. However, if the 1D smoothness changes, you will need to change to a different critical p-value in order to accurately control $\alpha$. This is exactly what SPM does: it effectively calculates the smoothness-dependent critical p-value that is required to retain $\alpha$ at the whole-domain level. SPM also does more than this, but in the context of this discussion, SPM effectively does automatically (based on the estimated smoothness) what you have done manually with critical p = 0.01.
Dear Professor Todd: Received, thank you. I need to take some time to digest this. Warm regards
Dear Prof. Todd: Thank you for your response. I have carefully studied your reply. "For spm1d, the main operational difference between SPM and traditional discrete analysis lies in determining a global statistical significance threshold (for the whole curve). The calculation of the statistic at each time point (e.g., the t-value at each time point) is actually the same, right?"
For spm1d, the main operational difference between SPM and traditional discrete analysis lies in determining a global statistical significance threshold (for the whole curve)
This is correct.
The calculation of the statistical value at each time point (e.g., the T-value at each time point) is actually the same, right?
Yes (mostly). The t-statistic, F-statistic and all other statistics are defined identically for 0D and nD analyses, regardless of the domain dimensionality n. I write "Yes (mostly)" because, while the definitions are identical, the actual calculation specifics can vary from software package to software package. These calculation differences are numerically negligible for simple cases like a one-sample t-test, but for more complex cases (e.g. domain-level corrections for nonsphericity in repeated-measures ANOVA designs) there are calculation strategies that are specific to domain-level methodologies like SPM.
Dear Prof. Pataky: Why is the correlation coefficient obtained using "spm.r" in MATLAB exactly the same as the one obtained using the "CORREL" function in Excel? What are the differences between SPM statistical correlation analysis and traditional Pearson correlation analysis? Thank you for your time.
Huang