Iteration warning during nonparam-ttest2

emiltoft commented 4 years ago

Dear Todd,

I'm comparing two groups using nonparam.ttest2 in Python and I get this warning: " The total nuumber of permutations (43430966148115) is very large and may cause computational problems. To enable non-parametric calculations for this many iterations set "force_iterations=True" when calling "inference". NOTE: Setting "force_iterations=True" may require substantial computational resources and may cause crashes. USE WITH CAUTION. "

I have tried with "force_iterations=True", but it never finishes. I looked at the code and found that the max number of iterations is 10000. I have tried different iteration values and get a different p-value for each one. Can you explain the following: What are these iterations? What is the best solution for my case when I get this error?

Kind regards, Emil

0todd0000 commented 4 years ago

Hello!

What are these iterations?

A single iteration consists of:

randomly relabeling the observations, and
computing the 1D test statistic and its maximum value (zmax)

The zmax values are saved over N iterations, yielding a distribution of zmax values. This numerical distribution (consisting of N different zmax values) is used to make statistical inferences, including computation of the critical threshold.
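
As a rough illustration of the two steps above, here is a minimal pure-Python sketch. It is not spm1d's implementation: the function names (t_stat_1d, zmax_distribution, critical_threshold) are made up for this example, and it assumes a pooled-variance two-sample t statistic, a two-tailed test (so zmax is the maximum absolute value), and toy random data.

```python
import math
import random


def t_stat_1d(a, b):
    """Pooled-variance two-sample t statistic at each node of the 1D continuum."""
    n_a, n_b, n_nodes = len(a), len(b), len(a[0])
    t = []
    for q in range(n_nodes):
        xa = [row[q] for row in a]
        xb = [row[q] for row in b]
        ma, mb = sum(xa) / n_a, sum(xb) / n_b
        ss = sum((x - ma) ** 2 for x in xa) + sum((x - mb) ** 2 for x in xb)
        sp = math.sqrt(ss / (n_a + n_b - 2))  # pooled standard deviation
        t.append((ma - mb) / (sp * math.sqrt(1 / n_a + 1 / n_b)))
    return t


def zmax_distribution(a, b, n_iterations, rng):
    """Collect one zmax value per random relabeling of the observations."""
    pooled = a + b
    n_a = len(a)
    zmax = []
    for _ in range(n_iterations):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # step 1: randomly relabel the observations
        t = t_stat_1d(shuffled[:n_a], shuffled[n_a:])  # step 2: 1D test statistic
        zmax.append(max(abs(v) for v in t))  # its maximum (two-tailed assumption)
    return zmax


def critical_threshold(zmax, alpha=0.05):
    """Empirical (1 - alpha) quantile of the zmax distribution."""
    ordered = sorted(zmax)
    k = math.ceil((1 - alpha) * len(ordered)) - 1
    return ordered[k]


# Toy data (assumption): 8 observations per group, 20 nodes each.
rng = random.Random(0)
group_a = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(8)]
group_b = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(8)]

zmax = zmax_distribution(group_a, group_b, n_iterations=500, rng=rng)
zstar = critical_threshold(zmax, alpha=0.05)
print("critical threshold:", zstar)
```

By construction, at most a fraction alpha of the saved zmax values exceed the threshold, which is what makes it usable as a critical value for the whole 1D continuum.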

When the data are normally distributed, this numerical distribution of zmax values converges to the (parametric) random field theory distribution. So if the data are reasonably normal, the parametric and non-parametric critical thresholds should be very similar.

This is a type of permutation test, and the spm1d implementation follows Nichols & Holmes (2002). In Nichols & Holmes (2002) you will find a nice explanation of permutation tests in the "Single voxel example" section.

What is the best solution for my case when I get this error?

In general, 10,000 iterations is probably sufficient.

Permutation-based results are not expected to be precisely reproducible; if you select N = 1000 iterations, for example, and run inference multiple times, you will likely see numerical variation across runs. However, the critical threshold is expected to converge to a numerically stable value as the number of iterations becomes large. The number of iterations should be either:

The maximum possible number of iterations (for small sample sizes), or
A large enough value that yields numerical stability across multiple runs. Typically N = 10,000 is sufficient.
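
To make the two options above concrete, here is a small standard-library sketch. It is an illustration under stated assumptions, not spm1d code: the helper names are hypothetical, the count of labelings assumes a two-sample design (choosing which observations go in group A), the cap of 10,000 is simply the value suggested above, and the stability check uses scalar toy data with the absolute mean difference as the statistic to keep it short.

```python
import math
import random
import statistics


def n_unique_labelings(n_a, n_b):
    """Number of distinct ways to split n_a + n_b observations into two groups."""
    return math.comb(n_a + n_b, n_a)


def choose_iterations(n_a, n_b, n_max=10_000):
    """Return (iterations, exhaustive): all labelings if few enough, else n_max."""
    total = n_unique_labelings(n_a, n_b)
    if total <= n_max:
        return total, True
    return n_max, False


def threshold_once(a, b, n_iterations, rng, alpha=0.05):
    """Critical |mean difference| from one run of random relabelings."""
    pooled, n_a = a + b, len(a)
    n_b = len(pooled) - n_a
    values = []
    for _ in range(n_iterations):
        s = pooled[:]
        rng.shuffle(s)
        values.append(abs(sum(s[:n_a]) / n_a - sum(s[n_a:]) / n_b))
    values.sort()
    return values[math.ceil((1 - alpha) * n_iterations) - 1]


def spread_across_runs(a, b, n_iterations, n_runs, rng):
    """Standard deviation of the critical threshold over repeated runs."""
    thresholds = [threshold_once(a, b, n_iterations, rng) for _ in range(n_runs)]
    return statistics.stdev(thresholds)


print(choose_iterations(5, 5))    # small samples: every labeling can be enumerated
print(choose_iterations(20, 20))  # larger samples: fall back to a fixed number

# Toy scalar data (assumption): 10 observations per group.
rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(10)]
b = [rng.gauss(0, 1) for _ in range(10)]
spread_small = spread_across_runs(a, b, n_iterations=50, n_runs=10, rng=rng)
spread_large = spread_across_runs(a, b, n_iterations=2000, n_runs=10, rng=rng)
print("spread with 50 iterations:  ", spread_small)
print("spread with 2000 iterations:", spread_large)
```

The same check can be applied to real results: rerun inference a few times with the same number of iterations, and increase that number if the critical threshold or p-value changes noticeably between runs.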

Todd

emiltoft commented 4 years ago

Hi Todd,

Thank you for the good and clear answer. I will have a look at the Nichols & Holmes (2002) reference.

Kind regards, Emil

0todd0000 commented 4 years ago

You're welcome!

0todd0000 / spm1d

Iteration warning during nonparam-ttest2 #116