pydata / patsy

Describing statistical models in Python using symbolic formulas

Design Matrix Error When Specifying Mixed Effects Formula #168

Closed filpia closed 3 years ago

filpia commented 3 years ago

Cross-posting from StackOverflow.

I'm trying to build a design matrix like the one in the example below. I can specify a random intercept for each group k, but I cannot get both a random intercept and a random slope on x for each group k unless x and k are both of type np.int64. It breaks when the pair has types (int64, float64) or (float64, float64).

Breaking example

import pandas as pd
import patsy as pt
import numpy as np

df = pd.DataFrame({
    'y':[0,0,1,1],
    'x': np.arange(0, 8, 2).astype(float),
    'k': [1,2,1,2]
})

print(df.dtypes)
pt.dmatrices(
    'y ~ x + (1+x|k)', data=df
)

>>> y      int64
>>> x    float64
>>> k      int64
>>> dtype: object
>>>
>>> PatsyError: Error evaluating factor: TypeError: unsupported operand type(s) for |: 'float' and 'int'
    y ~ x + (1+x|k)

Working example

import pandas as pd
import patsy as pt
import numpy as np

df = pd.DataFrame({
    'y':[0,0,1,1],
    'x':np.arange(0,8,2),
    'k': [1,2,1,2]
})

print(df.dtypes)

pt.dmatrices(
    'y ~ x + (1+x|k)', data=df
)
tomicapretto commented 3 years ago

Hi @filpia,

Patsy does not support formulas for mixed-effects models. Also, it is no longer under active development. In the meantime, you can use formulae if you need mixed-effects formulas right now. In the long term, I would recommend Formulaic, because it is the one that is going to be actively developed and maintained. It does not support mixed-effects formulas yet, but it is going to support them soonish.
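
For anyone wondering why the integer version appears to work: a plausible reading, assuming Patsy does not treat `|` as a formula operator and passes it to Python as ordinary code, is that `x|k` is evaluated as a bitwise OR. That operation is defined for integers but not for floats, which matches the error message in the report. The plain-Python sketch below illustrates only that operator behavior; it does not call Patsy, and the variable names are made up for the illustration.

```python
# Values taken from the issue's example data.
x_int = [0, 2, 4, 6]
k = [1, 2, 1, 2]

# With integers, `|` is a bitwise OR: it runs without error,
# but the result has nothing to do with random effects.
bitwise = [xi | ki for xi, ki in zip(x_int, k)]
print(bitwise)

# With floats, the same expression raises a TypeError.
x_float = [float(v) for v in x_int]
try:
    [xi | ki for xi, ki in zip(x_float, k)]
    message = None
except TypeError as err:
    message = str(err)
print(message)
```

If that reading is right, the "working" example builds a column of bitwise-OR values rather than a random intercept and slope, so the integer case is not a usable workaround.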