adriangb / scikeras

Scikit-Learn API wrapper for Keras.
https://www.adriangb.com/scikeras/
MIT License

CuPy arrays #64

Open stsievert opened 4 years ago

stsievert commented 4 years ago

Right now this library is tied to NumPy arrays pretty heavily. Will this library work with CuPy arrays? CuPy arrays are NumPy arrays for CUDA GPUs and are nearly a drop-in replacement for NumPy arrays. That'd provide a method to use GPU models + GPU data easily.

Keras's model.fit function only claims to work with the following:

I think it'd be worth exploring what's required to use CuPy arrays.

adriangb commented 4 years ago

I don't think either Scikit-Learn or Keras supports CuPy. If either of them does not support it, I don't think it makes sense to do it here.

stsievert commented 4 years ago

CuPy supports NEP18. I don't think adding support would be too difficult. Of course, adding support for CuPy is meaningless unless Keras supports it.

Some Scikit-Learn estimators support CuPy arrays, or will shortly: https://github.com/scikit-learn/scikit-learn/pull/16574, https://github.com/scikit-learn/scikit-learn/pull/17676/ (PCA), https://github.com/scikit-learn/scikit-learn/pull/17744/ (preprocessing).

adriangb commented 4 years ago

This is certainly interesting, but until (1) Keras supports this and (2) Scikit-Learn supports this on at least some estimators, I don't see how we could move forward here. I'll leave the issue open for now.

adriangb commented 4 years ago

Looping back to this, yeah I don't see how CuPy arrays could ever be supported. But maybe we can support TPUs by using TF's NumPy API? But will we ever see any performance gains as long as the data being fed in is array-like?

stsievert commented 4 years ago

I think this might boil down to the following TensorFlow issue: https://github.com/tensorflow/tensorflow/issues/29039

Can SciKeras be used with the GPU?

adriangb commented 4 years ago

It should be, although I have not done any testing (I figure that comes after API and documentation stuff). AFAIK if you select the device in TF, you should be able to use SciKeras with GPU.
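A minimal, untested sketch of what selecting the device could look like. The helper names pick_device and fit_on_device are hypothetical (not part of SciKeras), and estimator stands for any object with a Scikit-Learn style fit method, such as a SciKeras wrapper. It assumes TensorFlow is installed; TensorFlow usually places ops on a visible GPU by default, so the explicit device scope may not even be needed.

```python
def pick_device(gpus):
    """Return a TF device string: the first GPU if any are visible, else the CPU."""
    return "/GPU:0" if gpus else "/CPU:0"


def fit_on_device(estimator, X, y):
    """Fit an estimator with TensorFlow ops placed on the chosen device.

    Hypothetical helper: `estimator` is assumed to build and train its
    Keras model inside `fit`, so the device scope applies to that work.
    """
    import tensorflow as tf  # imported lazily so pick_device works without TF

    # list_physical_devices returns an empty list when no GPU is visible.
    device = pick_device(tf.config.list_physical_devices("GPU"))
    with tf.device(device):
        return estimator.fit(X, y)
```

Note that X and y would still be host (NumPy) arrays here, so TensorFlow copies them to the GPU itself.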

adriangb commented 4 years ago

I just went through that thread. Very interesting, although a bit over my head. I'll keep an eye on it.

jakirkham commented 3 years ago

While it's true TensorFlow hasn't implemented __cuda_array_interface__ yet, one might be able to go through DLPack instead. TensorFlow has APIs that support DLPack, and CuPy can export its arrays to DLPack. That adds another step, but maybe that is reasonable in the short term?
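A rough sketch of that extra step, assuming CuPy and a GPU build of TensorFlow are installed and share the same device. The function name cupy_to_tf and its from_dlpack parameter are hypothetical, added only so the conversion can be swapped out or tested without a GPU. It relies on CuPy's toDlpack method and on tf.experimental.dlpack.from_dlpack; newer releases of either library may prefer the __dlpack__ protocol instead.

```python
def cupy_to_tf(x, from_dlpack=None):
    """Hand a CuPy array to TensorFlow through a DLPack capsule.

    Hypothetical helper. `x` is assumed to be a cupy.ndarray (anything
    exposing `toDlpack()` works). The resulting tensor is expected to
    share GPU memory with `x` rather than copying through the host.
    """
    if from_dlpack is None:
        # Imported lazily so the function can be defined without TensorFlow.
        import tensorflow as tf

        from_dlpack = tf.experimental.dlpack.from_dlpack

    capsule = x.toDlpack()  # export the CuPy buffer as a DLPack capsule
    return from_dlpack(capsule)  # import the capsule on the TensorFlow side
```

Usage would be along the lines of cupy_to_tf(cupy.arange(10, dtype="float32")). Whether Keras's model.fit then consumes the resulting tensors without a round trip to host memory is a separate question that this sketch does not answer.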