bambinos / formulae

Formulas for mixed-effects models in Python
https://bambinos.github.io/formulae/
MIT License

Add `as_dataframe` method for group level design matrix #98

Open drbenvincent opened 1 year ago

drbenvincent commented 1 year ago

Simulated data:

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # seed chosen arbitrarily, only for reproducibility

groups = 3
samples_per_group = 10
N = groups * samples_per_group
x = np.tile(np.linspace(0, 10, samples_per_group), groups)
group = np.repeat(np.arange(groups), samples_per_group)
y = 2 + -0.5 * group + rng.normal(0, 0.1, N)
df = pd.DataFrame({"x": x, "group": group, "y": y})

If we model this with y ~ 1 + (1 | C(group)) and are interested in the design matrices, then at the moment we can do this:

from formulae import design_matrices

dm = design_matrices("y ~ 1 + (1 | C(group))", df)
dm.common.as_dataframe()

But we cannot call the as_dataframe method on the group-level design matrix, i.e. we can only call

np.array(dm.group)

which results in this: [Screenshot 2023-06-18 at 09 59 31]

but it would be nice to be able to call

dm.group.as_dataframe()

Not knowing how the internals work, I believe it would just be a matter of building a dataframe with the correct column labels, which should be 1|C(group)[0], 1|C(group)[1], and 1|C(group)[2].
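
To make the request concrete, here is a minimal sketch of the kind of output I have in mind. It does not touch formulae's internals: `group_matrix_as_dataframe` is a hypothetical helper, and `Z` is a hand-built indicator matrix standing in for `np.array(dm.group)`, on the assumption that the group-level matrix for a random intercept has one 0/1 column per group level.

```python
import numpy as np
import pandas as pd


def group_matrix_as_dataframe(matrix, term, levels):
    """Hypothetical helper: wrap a group-level design matrix in a labelled DataFrame.

    `term` is the left-hand label (e.g. "1|C(group)") and `levels` are the
    group levels, one per column of `matrix`.
    """
    columns = [f"{term}[{level}]" for level in levels]
    return pd.DataFrame(np.asarray(matrix), columns=columns)


# Stand-in for np.array(dm.group): one indicator column per group level.
group = np.repeat(np.arange(3), 10)
levels = np.unique(group).tolist()
Z = (group[:, None] == np.array(levels)[None, :]).astype(int)

Z_df = group_matrix_as_dataframe(Z, "1|C(group)", levels)
print(Z_df.head())
```

The column names are simply the term label followed by the level in square brackets, so the resulting frame has the columns 1|C(group)[0], 1|C(group)[1], and 1|C(group)[2].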

This would be particularly useful for teaching and learning purposes.

tomicapretto commented 1 year ago

Interesting idea indeed!