MICA-MNI / BrainStat

A statistics and context decoding toolbox for neuroimaging.
https://brainstat.readthedocs.io

Unexpected `TypeError` in `remove_duplicate_columns` #292

Status: Closed (NicolasGensollen closed this issue 3 days ago)

NicolasGensollen commented 2 years ago

When summing FixedEffect instances built from a pandas DataFrame whose dtypes were converted with convert_dtypes(), the function remove_duplicate_columns in brainstat/stats/terms.py raises the following error:

File ~/GitRepos/BrainStat/brainstat/stats/terms.py:243, in remove_duplicate_columns(df, tol)
    241 df *= 1 / tol
    242 # keep = df.round().T.drop_duplicates(keep="last").T.columns  # Slow!!
--> 243 idx = np.unique(df.round().values, axis=1, return_index=True)[-1]
    244 keep = df.columns[sorted(idx)]
    245 return keep

TypeError: The axis argument to unique is not supported for dtype object

Here is a minimal working example (MWE):

import pandas as pd
from brainstat.stats.terms import FixedEffect

df = pd.DataFrame({"age": [20.2, 33.1, 25.0],
                   "sex": ["Female", "Male", "Male"],
                  }).convert_dtypes()
df = df.apply(lambda x: x.astype('category') if x.name == "sex" else x)
FixedEffect(df.age) + FixedEffect(df.sex)  # raises the TypeError shown above

Expected behavior: no crash.

Environment: Python 3.9.11, BrainStat 0.3.6
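
One possible way around the error, sketched under assumptions: the traceback shows that np.unique is handed an object-dtype array, and np.unique rejects the axis argument for that dtype. Casting the design matrix to plain float64 before calling np.unique avoids this. The helper name unique_column_labels and the default tol below are hypothetical and are not part of BrainStat; the sketch also assumes every column is numeric with no missing values, and it divides by tol rather than scaling the DataFrame in place.

```python
import numpy as np
import pandas as pd


def unique_column_labels(df: pd.DataFrame, tol: float = 1e-8) -> pd.Index:
    """Hypothetical variant of remove_duplicate_columns (not BrainStat's code)."""
    # Cast to a plain float64 array so np.unique never sees dtype=object.
    # Assumption: all columns are numeric and contain no missing values.
    values = df.astype(float).to_numpy()
    # Scale by the tolerance and round, so near-identical columns compare equal.
    scaled = np.round(values / tol)
    # Index of the first occurrence of each distinct column.
    idx = np.unique(scaled, axis=1, return_index=True)[-1]
    return df.columns[sorted(idx)]


# np.unique with an axis argument fails on object arrays, as in the traceback.
obj = np.array([[20.2, 1, 20.2], [33.1, 0, 33.1]], dtype=object)
try:
    np.unique(obj, axis=1)
    raised = False
except TypeError:
    raised = True

# A design-matrix-like frame with nullable Float64 columns and one duplicate.
design = pd.DataFrame(
    {
        "age": pd.array([20.2, 33.1, 25.0], dtype="Float64"),
        "sex_Male": [0, 1, 1],
        "age_copy": pd.array([20.2, 33.1, 25.0], dtype="Float64"),
    }
)
kept = list(unique_column_labels(design))
```

Until a fix is available in the installed version, a simpler user-side alternative may be to skip convert_dtypes() or to cast the affected columns back with astype(float) before building the FixedEffect terms.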