Open shutUpAndCode opened 7 years ago
I'm fairly happy with everything but the ordinal variables. I found a good paper about them, but has anyone built regression models with ordinal variables before?
What I'm thinking as a starter for 10 (ignoring the ordinal issue for the moment): binarize the categorical variables, check whether any of our variables are highly correlated and remove the redundant ones, then fit some kind of regularized regression to help with feature selection (lasso and ridge regression seem like a good start).
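A rough sketch of the first two steps, in case it helps the discussion. It uses only the standard library so it runs anywhere, and the column names, toy values and the 0.9 correlation threshold are all made up for illustration. In practice something like `pandas.get_dummies` plus scikit-learn's `Lasso` or `Ridge` would probably replace most of this, and the regression step itself is not shown here.

```python
from math import sqrt

def one_hot(values):
    """Binarize a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {f"is_{c}": [1 if v == c else 0 for v in values] for c in categories}

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already-kept column is <= threshold."""
    kept = {}
    for name, col in columns.items():
        if all(abs(pearson(col, other)) <= threshold for other in kept.values()):
            kept[name] = col
    return kept

# Toy data (hypothetical, not taken from the real dataset)
features = {
    "area_sqft": [850, 1200, 1500, 2000, 2400],
    "area_sqm": [79, 111, 139, 186, 223],  # near-duplicate of area_sqft
    "age_years": [30, 5, 12, 42, 20],
}
features.update(one_hot(["brick", "wood", "brick", "stone", "wood"]))

selected = drop_correlated(features)
print(sorted(selected))
```

Which of two correlated columns survives depends on insertion order here (the first one seen is kept), so a real version might want a smarter rule, such as keeping whichever correlates better with house price.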
For now, can't we just treat them as equally spaced? For example, good, neutral and bad would become 1, 2 and 3. Working out a cleverer spacing would be a separate modelling problem; it could be useful to solve, but it's probably best left for now. What are your thoughts?
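Something like this is what the equally spaced encoding would look like. It is only a minimal sketch: the level names and their order are taken from the example above, and the real ordinal columns may use different labels.

```python
# Hypothetical level ordering; the real dataset's labels may differ.
QUALITY_LEVELS = ["good", "neutral", "bad"]

def encode_ordinal(values, levels):
    """Map each label to its 1-based position in `levels` (equal spacing)."""
    rank = {label: i for i, label in enumerate(levels, start=1)}
    return [rank[v] for v in values]

encoded = encode_ordinal(["bad", "good", "neutral", "good"], QUALITY_LEVELS)
print(encoded)
```

An unknown label raises a `KeyError`, which seems preferable to silently assigning it a rank.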
So the data we have is as follows: 23 categorical variables, 24 ordinal variables, 19 continuous variables and 13 discrete (numerical) variables.
That's 79 in total, plus 1 dependent variable (house price).