Open lukasheinrich opened 6 years ago
Sweet! That looks really cool. I'll poke at their docs over the weekend and then we can see if it is any more work to add in than normal.
I'd be especially interested in whether we can use the normal scipy-based optimization with that, or if we also have to code one ourselves.
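For reference, a minimal sketch of how the scipy optimizers could sit on top of a CuPy-backed objective (assuming CuPy is installed; the toy objective and variable names here are hypothetical, not pyhf code). The key point is that scipy works on host-side NumPy values, so device arrays need an explicit `cupy.asnumpy()` hand-off at the boundary:

```python
# Sketch only: scipy.optimize driving a GPU-resident objective via CuPy.
# The parameter vector stays a NumPy array on the host; the bulk data and
# the heavy arithmetic live on the GPU.
import numpy as np
import cupy as cp
from scipy.optimize import minimize

data = cp.asarray(np.random.normal(loc=1.0, size=10000))  # device-resident data

def objective(pars):
    mu = cp.asarray(pars[0])
    # toy least-squares objective evaluated on the GPU
    val = cp.sum((data - mu) ** 2)
    return float(cp.asnumpy(val))  # hand a plain host-side float back to scipy

result = minimize(objective, x0=np.array([0.0]), method="SLSQP")
print(result.x)
```

So in principle the existing scipy-based optimizer should work unchanged, at the cost of a small host/device transfer per objective evaluation.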
This is looking promising, given this section of the docs:
maybe relevant https://github.com/cupy/cupy/issues/1196
Ah, so I'll need to set things up on my cluster to actually do this, as the requirements to even install it include CUDA. So this isn't just CUDA enabled, but actually built using CUDA (this actually sounds like a good thing to me).
As an update, I'm still working on this, but I'm waiting to hear back from the HPC admins of my cluster. For technical reasons they might make me submit a list of all the software that I want, and they will set up the testing environment for me. This could be nice, but might result in a few days of delay.
Something else that I should look into is ClPy, an OpenCL backend for CuPy, as we're going to want to run the CuPy tests in CI and Travis doesn't support CUDA infrastructure.
Once we get an installation of CuPy it would be very interesting to get a feel for what the hardware speedup looks like. Even before implementing the backend we could run this notebook; if I understood https://docs-cupy.chainer.org/en/stable/tutorial/basic.html#how-to-write-cpu-gpu-agnostic-code correctly, probably only the `tb = np` lines in cell [2] need to be adapted (see the sketch after the timings below). Right now we get a factor of ~400 comparing a naive implementation with the new vectorized one (from #251):
numpy: 127 µs ± 12.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
naive: 56.1 ms ± 4.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
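Here is a rough sketch of what that `tb = np` swap could look like, following the linked CuPy docs on CPU/GPU agnostic code (the `tb` alias is taken from the notebook above; everything else here is illustrative and assumes a working CuPy install):

```python
# Sketch: bind the tensor library once, then write the vectorized code
# against that alias so the same cells run on CPU or GPU.
import numpy as np

try:
    import cupy as cp
    tb = cp   # run the vectorized computation on the GPU
except ImportError:
    tb = np   # fall back to NumPy on machines without CUDA

x = tb.arange(1_000_000, dtype=tb.float64)
print(tb.sum(tb.exp(-x)))  # identical code on either backend
```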
Additional overview material: Shohei Hido - CuPy: A NumPy-compatible Library for GPU, PyCon 2018
Further material: ContinuumIO's Numba and CuPy tutorial
Description
Following the discussion in #231 it should be quite easy to get a CuPy backend into pyhf.
https://cupy.chainer.org/
It's advertised as a drop-in replacement for numpy. This will give us GPU acceleration (like TF and PyTorch, but separate from autodifferentiation, which might be nice to have).
@matthewfeickert has some experience in writing these backends and iirc access to a CUDA enabled machine.
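To illustrate the general shape such a backend could take, here is a hypothetical sketch; this is not pyhf's actual backend interface, just a toy class showing how a CuPy backend could mirror a NumPy one method-for-method thanks to the drop-in API:

```python
# Hypothetical sketch only -- not pyhf's real backend API. The idea is that
# each tensor operation is a thin wrapper over the corresponding CuPy call,
# with CuPy handling device placement.
import cupy as cp

class cupy_backend:
    """Toy tensor backend exposing a few NumPy-like operations via CuPy."""

    def astensor(self, tensor_in):
        return cp.asarray(tensor_in, dtype=cp.float64)

    def tolist(self, tensor_in):
        return cp.asnumpy(tensor_in).tolist()

    def sum(self, tensor_in, axis=None):
        return cp.sum(tensor_in, axis=axis)

    def exp(self, tensor_in):
        return cp.exp(tensor_in)

backend = cupy_backend()
t = backend.astensor([1.0, 2.0, 3.0])
print(backend.tolist(backend.exp(t)))
```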