pangeo-data / pangeo-docker-images

Docker Images For Pangeo Jupyter Environment
https://pangeo-docker-images.readthedocs.io
MIT License

Add cupy to ml notebooks #322

Open rabernat opened 2 years ago

rabernat commented 2 years ago

It should be easy to support cupy in both ml notebooks, no?

ngam commented 2 years ago

Should be easy, yes, though as you said in another thread, the ML notebooks/images should ideally be ~strictly~ anchored around a framework. For now, the current frameworks are tensorflow and pytorch. There will likely be some pushing and shoving between the heavyweights if we keep adding things. To stay in line with the pytorch/tensorflow separation, I think a separate RAPIDS image would be more sensible. As I mentioned in the latter part of this comment, I think following the NGC containers may be the most sustainable approach in the future, though to be honest, I have no idea what the larger scope of these images will be, so I could be misunderstanding things!

ngam commented 2 years ago

Adding cupy to #345

weiji14 commented 1 year ago

There's been some interest at https://github.com/xarray-contrib/xbatcher/issues/87 in having a GPUDirect Storage demo which uses cupy-xarray and kvikIO (a RAPIDS AI library), and it would be good to have a working docker image with CuPy (that can be readily deployed on some cloud provider) to showcase the functionality.
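As a rough illustration of what "working" could mean here, a smoke test along these lines could be run inside a candidate image. This is only a sketch: `get_array_module` and `smoke_test` are hypothetical helper names, not part of any Pangeo image, and the NumPy fallback is an assumption added so the script also runs on CPU-only machines.

```python
import importlib

import numpy as np


def get_array_module():
    """Return (module, name): CuPy if it imports and sees a GPU, else NumPy."""
    try:
        cp = importlib.import_module("cupy")
        # getDeviceCount() raises if no CUDA driver or device is available.
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp, "cupy"
    except Exception:
        # CuPy missing or CUDA unusable: fall through to the NumPy fallback.
        pass
    return np, "numpy"


def smoke_test(n=1000):
    """Run a tiny array computation on whichever backend is available."""
    xp, name = get_array_module()
    x = xp.arange(n, dtype=xp.float64)
    total = float((x * 2).sum())  # copies the scalar back to the host if on GPU
    return name, total


if __name__ == "__main__":
    backend, total = smoke_test()
    print(f"backend={backend} total={total}")
```

If the reported backend is `numpy` inside a GPU image, either CuPy is not installed or the container cannot see a CUDA device.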

So, my question is: what's the conclusion of the experimental 'super-ml-notebook' docker container PRs at https://github.com/pangeo-data/pangeo-docker-images/pull/345#issuecomment-1157711813 and #369? Is the idea to:

  1. Have a single super-ml container with PyTorch+TensorFlow+CuPy, or
  2. Have 3 separate docker images (PyTorch, TensorFlow/JAX, RAPIDS/CuPy)?

Personally, I would vote for a separate docker image with CuPy, cuML, kvikIO, and other RAPIDS AI libraries, but I also acknowledge that maintaining 3 separate ML docker images can be a burden, as mentioned in #188. I'm still unsure whether a 3-in-1 image would be any easier :laughing:, but I wanted to hear people's thoughts on a good medium-term (<1 year) plan.