Open matquant14 opened 11 months ago
Looks like there is an explicit sort_by in the implementation:
import polars as pl

# Input order is d, e, f, z, a, but the dummy columns come back sorted.
pl.Series(list("defza")).to_dummies(separator="").columns
# ['a', 'd', 'e', 'f', 'z']
Is sorting by default the more expected option for the usage of .get_dummies()?
It seems maintain_order=True|False (group_by) and sort=True|False (value_counts) currently exist.
Description
I have a DataFrame with 10 categories. I want to one-hot encode them, making a 10x10 matrix, but the to_dummies function appears to sort the output columns alphabetically. I'd like to maintain the order, essentially creating a 10x10 identity matrix, but with the column names reflecting the original order of the categories. Here's an example of what I'm experiencing.
To get what I want, I have to add a select expression.
Can an argument be added to the to_dummies function that maintains the category (or string) order?