Closed hguturu closed 1 year ago
Hi @hguturu,
Parentheses in formulae have special meaning (they are grouping, order-of-operations operators). You can refer to the formula grammar docs for more info. You'll also find there how to quote special characters that should be included in field names; for example:
In [12]: all_phenotypes = pd.DataFrame({
    ...:     "(AltGrp)": [1, 0, 0, 1, 0, 1],
    ...:     "BinGrp": [0, 0, 0, 1, 1, 1],
    ...:     "ContGrp": [1, 2, 3, 4, 5, 6],
    ...: })
    ...:
    ...: design = formulaic.model_matrix(["`(AltGrp)` + BinGrp"], all_phenotypes)

In [13]: design
Out[13]:
   (AltGrp)  BinGrp
0         1       0
1         0       0
2         0       0
3         1       1
4         0       1
5         1       1
However, there is a bug here: `AltGrp` is not found in the data set, yet no exception is thrown. This is a regression, so I'll make sure it gets fixed.
Ah... I see you opened an issue about this separately anyway (#159). Closing this one in favour of that.
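For anyone building formulas from arbitrary column names, the backtick quoting shown above can be applied automatically. The following is only a minimal sketch: `quote_name` and `additive_formula` are hypothetical helpers (not part of formulaic), and the rule "anything that is not a plain letters/digits/underscore name gets quoted" is a conservative assumption rather than the library's actual grammar.

```python
import re

# Assumption: names made only of letters, digits and underscores (and not
# starting with a digit) are safe to use unquoted in a formula.
_PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_name(name: str) -> str:
    """Backtick-quote a column name unless it is a plain identifier."""
    if _PLAIN_NAME.fullmatch(name):
        return name
    if "`" in name:
        # Escaping of embedded backticks is not handled in this sketch.
        raise ValueError(f"cannot quote a name containing a backtick: {name!r}")
    return f"`{name}`"


def additive_formula(columns) -> str:
    """Join column names into an additive formula string such as 'a + b'."""
    return " + ".join(quote_name(c) for c in columns)


print(additive_formula(["(AltGrp)", "BinGrp"]))  # prints `(AltGrp)` + BinGrp
```

The resulting string could then be passed to `formulaic.model_matrix` in place of a hand-written formula.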
I am trying to make a design matrix from a master matrix of parameters.
yields
I assume this is due to the `()` in `(AltGrp)`. I was curious whether there are other special characters that should be excluded; since this fails silently, I want to avoid passing in the wrong matrix in the future.
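Until the silent failure is fixed, one defensive option is to compare the columns of the returned design matrix against the columns you intended to get. This is a rough sketch only: `check_design_columns` is a hypothetical helper, and it assumes the design matrix exposes its column labels as an iterable of strings (as a pandas DataFrame's `.columns` does). Any intercept column the library adds would need to be included in `expected`.

```python
def check_design_columns(expected, actual):
    """Raise ValueError if the design matrix columns differ from the expected ones."""
    expected, actual = list(expected), list(actual)
    # Columns we asked for that did not make it into the matrix.
    missing = [c for c in expected if c not in actual]
    # Columns in the matrix that we did not ask for.
    unexpected = [c for c in actual if c not in expected]
    if missing or unexpected:
        raise ValueError(
            f"design matrix mismatch: missing={missing}, unexpected={unexpected}"
        )
    return actual


# Hypothetical usage with a DataFrame-like result:
#     check_design_columns(["(AltGrp)", "BinGrp"], design.columns)
check_design_columns(["(AltGrp)", "BinGrp"], ["(AltGrp)", "BinGrp"])
```

A check like this turns a silently wrong matrix into an immediate error, at the cost of having to list the expected columns explicitly.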