When the categorical feature being one-hot encoded has a large number of categories, the "i.todense()" call at line 118 of featurize.py tries to unpack a huge dense array, and the program stops there. Instead of np.hstack(), we can use scipy.sparse.hstack() to stack the sparse matrices directly, without converting them to dense matrices. scipy.sparse.hstack() also supports stacking sparse and dense matrices together.
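
A minimal sketch of the suggested change, assuming the feature blocks are a sparse one-hot matrix plus a dense numeric array. The names `one_hot`, `numeric`, and `features`, and the sizes used, are made up for illustration and are not taken from featurize.py:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_rows, n_categories = 1000, 50000  # hypothetical sizes

# Build a one-hot encoded block as a sparse CSR matrix:
# exactly one non-zero entry per row, at the row's category index.
category_ids = rng.integers(0, n_categories, size=n_rows)
one_hot = sparse.csr_matrix(
    (np.ones(n_rows), (np.arange(n_rows), category_ids)),
    shape=(n_rows, n_categories),
)

# A small dense block, standing in for numeric features.
numeric = rng.random((n_rows, 3))

# Stack the sparse and dense blocks without calling .todense().
features = sparse.hstack([one_hot, numeric], format="csr")

# Size the equivalent dense float64 array would need, computed without allocating it.
dense_bytes = features.shape[0] * features.shape[1] * 8

print(features.shape, sparse.issparse(features))
print("stored values:", features.nnz, "| dense size in bytes:", dense_bytes)
```

The result stays sparse, so only the non-zero entries are stored; the dense equivalent would need the full rows × columns allocation.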
scipy.sparse.hstack() works well for regression use cases. For classification problems, however, most of the algorithms call something like x_train.todense(), which brings up the same error again.