Open mrocklin opened 6 years ago
I was playing around with @numba.stencil and parallel operations and ran into some usability problems. I raised them here: https://github.com/numba/numba/issues/2982
Things seem fine now. That issue contains decently usable examples and a small discussion on when parallelism may or may not be helpful.
Incidentally, a crazy idea I had is that we could have both Cython and Numba by using Cython's new Pure Python annotations. We could then (a) compile if available, but (b) ship pure Python anyway, and (c) use Numba if available. There's some hairy logic to figure out, but it might be doable and if it works I think it would be pretty awesome. =)
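A minimal sketch of the "(c) use Numba if available" part of that idea, leaving the Cython side out entirely. The helper name `maybe_jit` and the example function are hypothetical and only illustrate the optional-dependency pattern; the only Numba call used is `numba.njit`.

```python
# Optional-dependency pattern: compile with Numba when it is installed,
# otherwise fall back to the plain Python function.
try:
    import numba
except ImportError:  # Numba is optional here
    numba = None


def maybe_jit(func=None, **jit_kwargs):
    """Hypothetical helper: apply numba.njit if Numba is importable."""

    def decorate(f):
        if numba is None:
            return f  # ship pure Python anyway
        return numba.njit(**jit_kwargs)(f)

    # Support both @maybe_jit and @maybe_jit(...)
    if func is not None:
        return decorate(func)
    return decorate


@maybe_jit
def triangle_number(n):
    # Simple loop that Numba can compile in nopython mode.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


print(triangle_number(10))
```

The same function object is returned untouched when Numba is missing, so the package stays importable either way.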
Also incidentally, my experience optimizing and debugging Numba code is that it is still a long way from Cython. The one very nice thing is that you can set NUMBA_DISABLE_JIT=1 when testing and get coverage reports for your Numba functions, which is pretty cool. (I'm not sure how coverage works with Cython, actually.)
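As a rough illustration of the NUMBA_DISABLE_JIT trick, the variable can also be set from Python, as long as that happens before Numba is first imported (Numba reads it at import time). The function `clip_sum` is a made-up example, and the import guard is only there so the sketch still runs where Numba is not installed.

```python
import os

# Must be set before the first "import numba" in the process.
os.environ["NUMBA_DISABLE_JIT"] = "1"

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, treat the decorator as a no-op.
    def njit(func):
        return func


@njit
def clip_sum(values, limit):
    # With the JIT disabled this body runs as ordinary Python,
    # so coverage tools and pdb can see every line.
    total = 0.0
    for v in values:
        if v > limit:
            total += limit
        else:
            total += v
    return total


print(clip_sum((1.0, 5.0, 10.0), 4.0))
```

In a test suite it is usually simpler to export the variable in the shell or CI configuration rather than in code, so that it is set before any module imports Numba.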
@jni We'd be most interested in hearing about any specific issues you've experienced in optimizing and debugging Numba. Also, about what features the cython stack has that you feel are missing in Numba.
The most recent Numba (0.38, https://github.com/numba/numba/blob/5ba86a9ad3c2c425b48bc16e4adb908a8830bc9c/CHANGE_LOG#L1-L143) has had a load of user-facing improvements added based on community feedback. We have a tracking ticket here: https://github.com/numba/numba/issues/2888. Please could you file any issue(s)/feedback against that?
The PR https://github.com/numba/numba/pull/2793 is a work in progress. It brings magics that provide more Numba diagnostic capabilities to tools like Jupyter/IPython, and exposes them via an API elsewhere. It is hoped that this will make the next release.
Thanks.
It would be pretty useful to have closer interop between Dask and Numba so one could add a custom function to operate on blocks and have it JIT'd for high performance. I remember running into some pickling issues when I last tried this, which may or may not be a blocker depending on how this is implemented (i.e. compilation before vs. after transmission).
I think that serialization is no longer a problem. Also, you may appreciate https://github.com/numba/numba/issues/2979
@stuartarchibald I've updated https://github.com/numba/numba/issues/2888 with some specific concerns. Thank you!
@jni great, thanks for doing that, much appreciated.
In https://github.com/scikit-image/scikit-image/wiki/UC-Berkeley-(BIDS)-sprint,-May-28-Jun-2-2018 scikit-image devs write the following:
I suspect that there are a few things within Numba that we might want to investigate, depending on time:
General performance
We could look at using Numba to rewrite existing Cython code to see if there is any difference in CPU or memory use, and also how the development experience feels.
We might want to read through this document on performance to guide work here: http://numba.pydata.org/numba-doc/dev/user/performance-tips.html
Parallelism
Numba has both implicit parallelism (automatic parallelization of array operations) and explicit prange loops.
https://numba.pydata.org/numba-doc/dev/user/parallel.html
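For the explicit form, a small sketch of a `prange` reduction might look like the following. The function is invented for illustration, and the import guard (falling back to `range` and a no-op decorator) is an assumption added only so the snippet runs without Numba.

```python
try:
    from numba import njit, prange
except ImportError:
    # Assumption: fall back to serial pure Python when Numba is absent.
    prange = range

    def njit(*args, **kwargs):
        def decorate(func):
            return func
        return decorate


@njit(parallel=True)
def sum_of_squares(n):
    total = 0
    # Iterations are split across threads; Numba recognises
    # "total +=" as a reduction and combines per-thread results.
    for i in prange(n):
        total += i * i
    return total


print(sum_of_squares(1000))
```

Loops whose iterations depend on each other (other than through such a reduction) are not safe to convert to `prange`.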
Stencil operators
Many of the scikit-image functions are well defined on a local neighborhood. This might be a good fit for the experimental stencil decorator:
https://numba.pydata.org/numba-doc/dev/user/stencil.html#numba-stencil
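To give a feel for the decorator, here is a hedged sketch of a 3x3 local mean. The names `local_mean` and `local_mean_reference` are hypothetical and not taken from scikit-image. The NumPy reference doubles as a fallback when Numba is not installed and as a check on the stencil result; both leave a one-pixel zero border, which is the stencil decorator's default behaviour.

```python
import numpy as np

try:
    from numba import njit, stencil
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False


def local_mean_reference(image):
    """3x3 mean via slicing; the one-pixel border is left at zero."""
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=np.float64)
    for di in range(3):
        for dj in range(3):
            out[1:-1, 1:-1] += image[di:di + rows - 2, dj:dj + cols - 2]
    out[1:-1, 1:-1] /= 9.0
    return out


if HAVE_NUMBA:
    @stencil
    def _mean_kernel(a):
        # Indices are relative to the current output pixel.
        return (a[-1, -1] + a[-1, 0] + a[-1, 1] +
                a[0, -1] + a[0, 0] + a[0, 1] +
                a[1, -1] + a[1, 0] + a[1, 1]) / 9.0

    @njit(parallel=True)
    def local_mean(image):
        # Calling the stencil from a parallel jitted function lets
        # Numba parallelize the neighborhood loop.
        return _mean_kernel(image)
else:
    local_mean = local_mean_reference


image = np.arange(25, dtype=np.float64).reshape(5, 5)
result = local_mean(image)
print(np.allclose(result, local_mean_reference(image)))
```

Whether `parallel=True` actually helps depends on image size; for small arrays the threading overhead can dominate.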
GPU algorithms
I'm not sure what the comfort level of those attending is, but we might also consider writing CUDA code with Numba.
https://numba.pydata.org/numba-doc/dev/cuda/index.html
Disadvantages
We should look at its behavior on ARM and Power architectures and see if the current bugs affect scikit-image use.
We should also get a feel for debugging and troubleshooting to see how it compares to the Cython experience.
cc @stefanv @emmanuelle @jni @kne42 @jakirkham