I need to fit a model to predict a continuous variable. The problem with the data is that the primary key is a string field, and the data set is huge, so I can't one-hot encode it. I can't drop the column either, because it is not a true primary key: some of its values repeat across records, the test data also contains some of the same values, and its correlation value is also high.
So how can I resolve this issue?
Thanks,
Neeru Singla