Normally, when encoding a categorical variable with k levels, one creates k - 1 dummy variables. Dropping one level avoids perfect multicollinearity (the dummy variable trap) when fitting a regression model that includes an intercept. For instance, if the variable is sex and the levels are male and female, a single "sex" column is enough.
In pandas, this is implemented by the "drop_first" parameter of the pandas.get_dummies() function. However, I see no such option in the Polars equivalent.
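For reference, here is a minimal sketch of the pandas behavior I mean. The toy frame and its "sex" and "age" columns are made up for illustration:

```python
import pandas as pd

# Hypothetical toy data: one categorical column with k = 2 levels.
df = pd.DataFrame({"age": [31, 45, 27], "sex": ["male", "female", "male"]})

# drop_first=True keeps k - 1 dummies; the first level in sorted order
# ("female" here) is dropped and becomes the implicit reference level.
encoded = pd.get_dummies(df, columns=["sex"], drop_first=True)

print(encoded.columns.tolist())
```

With two levels, this leaves a single dummy column for "sex" alongside the untouched "age" column.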
You could just drop the unnecessary column after encoding. That actually gives you more direct control over which level serves as the reference (control) group in an inference context.