navdeep-G closed 6 years ago
I was thinking of utilizing py datatable somehow.
@navdeep-G It would be ideal to use py datatable on the backend while keeping the scikit-learn API. Here's a list of all the scikit-learn preprocessing methods that we could use (falling back to scikit-learn when GPU is not supported?): http://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing
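To make "keeping the scikit-learn API" concrete, here is a rough sketch of the `fit` / `transform` / `fit_transform` convention on a label encoder. The class name `LabelEncoderSketch` is hypothetical, and plain Python lists stand in for whatever backend (py datatable or otherwise) would actually do the work; this is not an existing implementation.

```python
class LabelEncoderSketch:
    """Hypothetical encoder that follows scikit-learn's fit/transform convention.

    Plain Python stands in for the real backend here (an assumption for
    illustration only).
    """

    def fit(self, values):
        # Learn the sorted set of distinct labels, as scikit-learn's
        # LabelEncoder does, and remember each label's integer code.
        self.classes_ = sorted(set(values))
        self._index = {label: code for code, label in enumerate(self.classes_)}
        return self

    def transform(self, values):
        # Map each label to its learned code; unseen labels raise KeyError.
        return [self._index[value] for value in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


encoder = LabelEncoderSketch()
codes = encoder.fit_transform(["b", "a", "b"])
```

Because only the method names and return conventions are fixed, the body of `fit` and `transform` could be swapped for a different backend without changing user code.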
Yes, but both run on the CPU, so there is no need to fall back.
@navdeep-G I thought py datatable was GPU capable?
@ledell py datatable is pure CPU as of today.
After researching more, I think sklearn's capabilities achieve what is needed, and most users of Python machine learning libraries will know how to use these methods.
@navdeep-G What kind of data prep? I assume you mean things like label encoding, one-hot encoding & imputation? Can we just expose the Scikit-learn methods for this, or do we need to write new methods from scratch?
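For reference, the three operations mentioned above look roughly like this with scikit-learn's own classes. This is a minimal sketch assuming a recent scikit-learn release (where imputation lives in `sklearn.impute.SimpleImputer`) and NumPy; the toy arrays are made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Toy categorical column (illustrative data only).
colors = np.array(["red", "green", "blue", "green"])

# Label encoding: each distinct string gets an integer code,
# assigned in sorted order of the class names.
label_encoder = LabelEncoder()
color_codes = label_encoder.fit_transform(colors)

# One-hot encoding: expects a 2-D input, so reshape to one column.
# The default output is a sparse matrix; toarray() makes it dense.
one_hot_encoder = OneHotEncoder(handle_unknown="ignore")
color_one_hot = one_hot_encoder.fit_transform(colors.reshape(-1, 1)).toarray()

# Imputation: replace missing values with the column mean.
ages = np.array([[25.0], [np.nan], [35.0]])
imputer = SimpleImputer(strategy="mean")
ages_filled = imputer.fit_transform(ages)
```

Exposing these directly would mean users get the familiar `fit_transform` calls with no new API to learn.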