MICA-MNI / BrainStat

A statistics and context decoding toolbox for neuroimaging.
https://brainstat.readthedocs.io

Inconsistent sum of `FixedEffect` elements #290

Closed NicolasGensollen closed 2 days ago

NicolasGensollen commented 2 years ago

When adding FixedEffect terms from a list, I expected sum() to give the same result as reducing the list with lambda x, y: x + y. However, sum() produces an additional column, x0.

Here is an MWE:

from functools import reduce
from brainstat.stats.terms import FixedEffect
from brainstat.tutorial.utils import fetch_mics_data

thickness, demographics = fetch_mics_data()
model = []
term_age = FixedEffect(demographics.AGE_AT_SCAN)
model.append(term_age)
print(reduce(lambda x, y: x + y, model))
print(sum(model))

    intercept  AGE_AT_SCAN
0           1           27
1           1           25
2           1           33
3           1           36
4           1           31
..        ...          ...
77          1           30
78          1           33
79          1           26
80          1           26
81          1           29

[82 rows x 2 columns]
    x0  intercept  AGE_AT_SCAN
0    0          1           27
1    0          1           25
2    0          1           33
3    0          1           36
4    0          1           31
..  ..        ...          ...
77   0          1           30
78   0          1           33
79   0          1           26
80   0          1           26
81   0          1           29

[82 rows x 3 columns]

I was expecting the same results.

I am using BrainStat 0.3.6.

zihuaihuai commented 2 days ago

Hello,

In the implementation you're referring to, the + operator does not perform element-wise addition as it would for numerical values. Instead, it works more like an "append" operation that combines the variables of two terms into one statistical model.

Explanation:

When you write Term A + Term B, the result keeps every variable of Term A and appends the variables of Term B that are not already present in Term A. The new term therefore contains each unique variable from both terms exactly once.

Example:

Let's consider two terms representing different variables in a design matrix for a linear model:

Term A: represents variables [X1, X2]
Term B: represents variables [X2, X3]

Term A + Term B outputs [X1, X2, X3], since X3 is the only column that is in B but not in A.
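The append-like behaviour described above can be sketched with a small stand-in class. This is only an illustration: ToyTerm is a hypothetical class that tracks column names and nothing else, not BrainStat's actual FixedEffect implementation.

```python
class ToyTerm:
    """Hypothetical stand-in for a model term; it stores column names only."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __add__(self, other):
        # Keep all of our own columns, then append the columns of `other`
        # that we do not already have.
        extra = [c for c in other.columns if c not in self.columns]
        return ToyTerm(self.columns + extra)


term_a = ToyTerm(["X1", "X2"])
term_b = ToyTerm(["X2", "X3"])
combined = term_a + term_b
print(combined.columns)  # ['X1', 'X2', 'X3']
```

X2 appears only once in the result because it is already present in term_a when term_b is added.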

Cause

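The cause is that sum() starts from the integer 0 by default, so the first operation it performs is 0 + term_age. Adding a numeric value to a FixedEffect is not handled well: judging by the output above, the 0 is treated as a term of its own and ends up as the extra all-zero column x0. reduce() has no start value and begins directly with the first term, so no such column appears.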
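The difference between the two calls can be reproduced without BrainStat. The sketch below uses a hypothetical ToyTerm class whose __radd__ turns a number into a column named x0; that mimics the reported output and is an assumption, not BrainStat's real code. The SEX column is likewise made up for illustration. The last line shows a possible workaround: passing the first term as the start value of sum(), which is plain Python behaviour and avoids the implicit 0.

```python
from functools import reduce


class ToyTerm:
    """Hypothetical stand-in for a model term; it stores column names only."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __add__(self, other):
        # Append the columns of `other` that are not already present.
        extra = [c for c in other.columns if c not in self.columns]
        return ToyTerm(self.columns + extra)

    def __radd__(self, other):
        # Reached for `0 + term`. Assumption: the number becomes a column
        # called "x0", as the output in the report suggests.
        return ToyTerm(["x0"]) + self


model = [
    ToyTerm(["intercept", "AGE_AT_SCAN"]),
    ToyTerm(["intercept", "SEX"]),  # hypothetical second term
]

via_sum = sum(model)                            # starts with 0 + model[0]
via_reduce = reduce(lambda x, y: x + y, model)  # starts with model[0]
via_sum_start = sum(model[1:], model[0])        # explicit start value

print(via_sum.columns)        # ['x0', 'intercept', 'AGE_AT_SCAN', 'SEX']
print(via_reduce.columns)     # ['intercept', 'AGE_AT_SCAN', 'SEX']
print(via_sum_start.columns)  # ['intercept', 'AGE_AT_SCAN', 'SEX']
```

With an explicit start value, sum() performs the same sequence of additions as reduce(), so the two results should match for real FixedEffect terms as well.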