Running analysis on N = 1

jlit29 commented 3 months ago

Hi Todd,

I have a fairly large dataset, all from one person. I was looking at using SPM to compare left- and right-side data before and after a specific date. When I tried, I ran into some errors that I assume were caused by N = 1. Is there a way to work around this? I am specifically interested in testing the L/R and pre/post differences, but only in this one individual. Thank you.

0todd0000 commented 3 months ago

Hello! If there are multiple measurements then analysis should be possible. If there is just one observation then statistical analysis of any kind is not possible. All statistical analyses require variance estimates, which in turn require N > 1.
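
As a minimal illustration of why a single observation is not enough (plain NumPy, not spm1d; the numbers are made up for the example), the unbiased sample variance divides by N - 1, so it is undefined when N = 1:

```python
import numpy as np

def sample_variance(y):
    """Unbiased sample variance; NaN when fewer than two observations."""
    y = np.asarray(y, dtype=float)
    if y.size < 2:
        # The N - 1 denominator is zero, so no variance estimate exists
        return float("nan")
    return float(np.var(y, ddof=1))

# Hypothetical values, for illustration only
one_obs = sample_variance([1.8])
nine_obs = sample_variance([1.8, 2.1, 1.9, 2.0, 2.2, 1.7, 2.0, 1.9, 2.1])

print(one_obs)   # nan
print(nine_obs)  # a positive, finite number
```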

jlit29 commented 3 months ago

There are multiple measurements: 36 individual signals total, 9 in each group (Left/Right, Pre/Post), but all from the same subject. When I try to run the SPM, I get IndexError: tuple index out of range. When you say it should run with multiple measurements, do you mean my current design would work, or does it require multiple subjects (each with multiple measurements)? Thanks again Todd.

0todd0000 commented 3 months ago

> When you say it should run with multiple measurements, do you mean my current design would work, or does it require multiple subjects (and they each have multiple measurements).

Multiple observations are required but they needn't come from multiple individuals:

- If the observations come from one individual, then inference will pertain to the population of all possible observations from that individual.
- If instead the observations come from multiple individuals, then inference will pertain to the population of individuals from which those specific individuals were drawn.

> There are multiple measurements: 36 individual signals total, 9 in each group (Left/Right, Pre/Post), but all from the same subject. When I try to run the SPM, I get IndexError: tuple index out of range.

It is difficult to know what is causing this error. Can you please copy-and-paste your code into this thread, along with a copy of the full error message?

jlit29 commented 3 months ago

Here is the code:

# Libraries
import pandas as pd
import spm1d
import numpy as np
from matplotlib import pyplot
import sys

# Import data
df = pd.read_csv(f'{name}_jumpdata.csv', index_col=0)

# Set columns
A = df["LR_Group"]   # L/R, L is 1, R is 0
B = df["PrePost"]   # Pre/Post, Pre is 0, Post is 1
SUBJ = df["SUBJ"]   # Subj column

# Separate data from variables
Y = df.drop(columns = ["LR_Group", "PrePost", "SUBJ"])
Y = np.array(Y)
A = np.array(A)
B = np.array(B)
SUBJ = np.array(SUBJ)

# SPM
FF = spm1d.stats.anova2rm(Y, A, B, SUBJ)
FFi = [F.inference(alpha=0.05) for F in FF]

Here is the error message when running FF = spm1d.stats.anova2rm(Y, A, B, SUBJ):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\taylorj\AppData\Roaming\Python\Python311\site-packages\spm1d\stats\anova\ui.py", line 228, in anova2rm
    design  = designs.ANOVA2rm(A, B, SUBJ)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\taylorj\AppData\Roaming\Python\Python311\site-packages\spm1d\stats\anova\designs.py", line 294, in __init__
    self._assemble()
  File "C:\Users\taylorj\AppData\Roaming\Python\Python311\site-packages\spm1d\stats\anova\designs.py", line 311, in _assemble
    builder.add_main_columns('S', XS)
  File "C:\Users\taylorj\AppData\Roaming\Python\Python311\site-packages\spm1d\stats\anova\designs.py", line 52, in add_main_columns
    i0,n             = self.ncol, X.shape[1]
                                  ~~~~~~~^^^
IndexError: tuple index out of range

0todd0000 commented 3 months ago

The error is caused by the SUBJ variable, which must indicate how the observations are paired across the condition combinations. In multi-subject designs this is relatively easy: simply enter an integer code for each subject. To use an RM design for a single subject, the observations themselves must have some meaningful pairing. If there are just multiple measurements that were not explicitly paired across the conditions, then ordinary two-way ANOVA without repeated measures, spm1d.stats.anova2, would be more appropriate:

FF = spm1d.stats.anova2(Y, A, B)
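
A hedged sketch of that suggestion on synthetic data: the array sizes, the random signals, and the `smooth` helper are assumptions made for illustration, not the data from this thread. It first confirms that the design is balanced (9 observations in each of the 4 cells), then runs the non-RM two-way ANOVA with the same inference step as the code posted above. The spm1d calls are skipped if the package is not installed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the jump data (assumed sizes): 36 signals x 101 nodes
n_per_cell, Q = 9, 101
A = np.repeat([0, 0, 1, 1], n_per_cell)   # L/R code, as in LR_Group
B = np.repeat([0, 1, 0, 1], n_per_cell)   # Pre/Post code, as in PrePost

def smooth(y, width=11):
    """Moving-average smoothing so the fake signals resemble 1D continua."""
    kernel = np.ones(width) / width
    return np.convolve(y, kernel, mode="same")

Y = np.array([smooth(rng.standard_normal(Q)) for _ in range(A.size)])

# Count observations in each (A, B) cell to confirm the design is balanced
cells, counts = np.unique(np.column_stack([A, B]), axis=0, return_counts=True)
print(dict(zip(map(tuple, cells.tolist()), counts.tolist())))

try:
    import spm1d
except ImportError:
    spm1d = None  # the design check above still runs without spm1d

if spm1d is not None:
    # Two-way ANOVA without repeated measures: no SUBJ argument is needed
    FF = spm1d.stats.anova2(Y, A, B)
    FFi = [F.inference(alpha=0.05) for F in FF]
    for Fi in FFi:
        print(Fi)
```

Because the signals here are pure noise, the output only demonstrates that the design runs; it says nothing about real L/R or pre/post effects.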

0todd0000 / spm1d

Running analysis on N = 1 #295