mekazanc closed this issue 8 months ago
Hi @mekazanc,
This package has no special treatment for categorical variables. All input processing happens in this function, and it mostly deals with the structure of the input.
Probably the best workaround is to apply a one-hot-encoding transformation before processing the data.
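As a minimal sketch of that workaround, assuming the data lives in a pandas DataFrame with the response in the first column, something like the following could be run before passing the data on. The column names and values here are made up for illustration and are not taken from the package or from the original data.

```python
import pandas as pd

# Hypothetical data: the response first, then a numeric and a categorical covariate.
data = pd.DataFrame({
    "installs": [120, 135, 150, 160, 158, 170],
    "installs_other_country": [80, 90, 95, 110, 105, 115],
    "app_version": ["1.0", "1.0", "1.1", "1.1", "2.0", "2.0"],
})

# Replace the categorical column with one 0/1 indicator column per level.
# drop_first=True drops the first level so the indicators are not perfectly collinear.
encoded = pd.get_dummies(data, columns=["app_version"], drop_first=True, dtype=float)

print(encoded.columns.tolist())
```

The resulting frame contains only numeric columns, with the response still first, so it can be used wherever the original numeric data was used.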
You said that you found low weights for the correlation of those variables. I'm assuming you passed the variable in as a float value. That will not work: those values have no real meaning, so any correlation extracted from them is due to noise, not signal.
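To illustrate the point, here is a small sketch with made-up numbers. It assumes an unordered category with three levels and computes the Pearson correlation with the target under two equally arbitrary numeric codings. The `pearson` helper and all values are hypothetical and only serve the illustration.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Hypothetical target values and an unordered categorical covariate.
target = [10, 12, 30, 32, 20, 22]
category = ["a", "a", "b", "b", "c", "c"]

# Two arbitrary ways of turning the same labels into numbers.
coding_1 = {"a": 0, "b": 1, "c": 2}
coding_2 = {"a": 0, "b": 2, "c": 1}

r1 = pearson([coding_1[c] for c in category], target)
r2 = pearson([coding_2[c] for c in category], target)

print(f"coding 1: r = {r1:.2f}")
print(f"coding 2: r = {r2:.2f}")
```

The two codings carry exactly the same information, yet the correlation with the target differs substantially between them, which is why a correlation computed on such codes says little about the variable itself.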
I got it! Thanks for your explanation.
Hi again, I have a little question :)
Assume that we have an ordinal categorical variable (e.g. app_version) whose correlation with the target is > 0.6, where the target is app installs.
Can we give this categorical variable directly to the model, or should we transform it somehow first?
My observation is that all of the control variables (e.g. installs in different countries) share the same unit as the target, but this variable does not. Also, after a few runs the model did not give it a high coefficient (< 0.1), even though its correlation is high!
I could not find any example regarding categorical variables in the sample notebook, so I wanted to ask here.
Thanks in advance