pydata / sparse

Sparse multi-dimensional arrays for the PyData ecosystem
https://sparse.pydata.org
BSD 3-Clause "New" or "Revised" License

sparse and numba #378

Open jodemaey opened 4 years ago

jodemaey commented 4 years ago

Hi there,

I am using a homemade tensor class in my model code, together with numba functions for tensor products. If I understand correctly, sparse also uses numba to accelerate its algorithms.

I'm very happy to have found sparse because I don't like to reinvent the wheel. In the future I would rather use sparse than my homemade solution, but I'm concerned because I use numba in numerous places in the code. So my question is: can sparse be used inside njit-decorated functions? Is there something like a tensor jitclass?

Thank you in advance.

hameerabbasi commented 4 years ago

No it can't currently, but we're working to make this possible.

hameerabbasi commented 4 years ago

There is basic support for accessing the properties but not for performing operations.

jodemaey commented 4 years ago

Ok thanks, I think I will start a new branch and try sparse without numba to see how it performs.

hameerabbasi commented 4 years ago

We're looking at generating tensor products and contractions specifically in the branch second-taco and the issue #365.

jodemaey commented 4 years ago

I'm really sorry, but I do not understand how this format implementation relates to a sparse tensor object being usable inside a numba njit-decorated function.

hameerabbasi commented 4 years ago

We're writing it in such a way that it generates JIT-able code from the get-go.

jodemaey commented 4 years ago

Ok, I think I see it now. Thank you very much. Last question: will that also include being able to pass a sparse array to an njitted function?

hameerabbasi commented 4 years ago

We plan on it. 😄

jodemaey commented 4 years ago

Cool, I will start using sparse for my code in parts that do not use numba and then move fully to it when this implementation is finished. Thank you very much for your answers.

jodemaey commented 2 years ago

Sorry to come back to you about this, but I was curious to see whether it is now possible to pass a sparse tensor to an njitted function. I noticed that, for the COO class, numba no longer complains at the interface level (the argument type is recognized) but fails later, because the operation itself has no implementation:

import numpy as np
import sparse as sp
from numba import njit

@njit
def add(a):
    return a + 1

x = np.random.random((4,4))
x[x<0.5] = 0.
s = sp.COO(x)
add(s)
Traceback (most recent call last):
  File "/home/jodemaey/anaconda3/envs/qgs/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-34a7f28281ed>", line 1, in <module>
    add(s)
  File "/home/jodemaey/anaconda3/envs/qgs/lib/python3.8/site-packages/numba/core/dispatcher.py", line 482, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/jodemaey/anaconda3/envs/qgs/lib/python3.8/site-packages/numba/core/dispatcher.py", line 423, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function add>) found for signature:

 >>> add(COOType[float64, int64, 2], Literal[int](1))

There are 18 candidate implementations:
  - Of which 16 did not match due to:
  Overload of function 'add': File: <numerous>: Line N/A.
    With argument(s): '(COOType[float64, int64, 2], int64)':
   No match.
  - Of which 2 did not match due to:
  Operator Overload in function 'add': File: unknown: Line unknown.
    With argument(s): '(COOType[float64, int64, 2], int64)':
   No match for registered cases:
    * (int64, int64) -> int64
    * (int64, uint64) -> int64
    * (uint64, int64) -> int64
    * (uint64, uint64) -> uint64
    * (float32, float32) -> float32
    * (float64, float64) -> float64
    * (complex64, complex64) -> complex64
    * (complex128, complex128) -> complex128
During: typing of intrinsic-call at <ipython-input-4-e685fec80380> (3)
File "<ipython-input-4-e685fec80380>", line 3:
def add(a):
    return a + 1
    ^
sp.__version__
Out[9]: '0.12.0'
import numba
numba.__version__
Out[11]: '0.54.0rc3'

However, for the DOK class I still get the earlier type error:

add(sp.DOK(x))
Traceback (most recent call last):
  File "/home/jodemaey/anaconda3/envs/qgs/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-12-d150d4b5fa24>", line 1, in <module>
    add(sp.DOK(x))
  File "/home/jodemaey/anaconda3/envs/qgs/lib/python3.8/site-packages/numba/core/dispatcher.py", line 482, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/jodemaey/anaconda3/envs/qgs/lib/python3.8/site-packages/numba/core/dispatcher.py", line 423, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type pyobject
During: typing of argument at <ipython-input-4-e685fec80380> (3)
File "<ipython-input-4-e685fec80380>", line 3:
def add(a):
    return a + 1
    ^ 
This error may have been caused by the following argument(s):
- argument 0: Cannot determine Numba type of <class 'sparse._dok.DOK'>

So I was wondering what has changed and, also, whether it would work if one properly defined the function implementation for the COO class.

Thank you in advance,

Jonathan

hameerabbasi commented 2 years ago

Yes, passing sparse arrays into njit functions is experimental and you shouldn't rely on it at all. Currently, you can pass COO but only access its attributes.

jodemaey commented 2 years ago

Thanks for your response. You still plan to get there at some point, right? That would be a great feature for me.

Best,

Jonathan

hameerabbasi commented 2 years ago

We plan on offering similar features, without requiring use of Numba. See the TACO project for more details.

jodemaey commented 2 years ago

My code uses numba njit a lot to accelerate loops, so accelerated operations on sparse tensors are only one side of the story for me. This is why being able to pass a sparse tensor to an njitted function is important to me. Right now my solution is to extract the coordinates and values of the COO array as numpy arrays and pass those instead, which works but is not very elegant.
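For readers following along, the workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the function name `coo_matvec` is hypothetical, and the coordinate and value arrays are built here with plain numpy so that the snippet does not need sparse installed. With a `sparse.COO` array `s`, the same two arrays would come from `s.coords` (shape `(ndim, nnz)`) and `s.data`. The `try`/`except` fallback is there only so the sketch still runs, uncompiled, when numba is missing.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without numba.
    def njit(func):
        return func


@njit
def coo_matvec(coords, data, n_rows, vec):
    """Multiply a 2-D COO matrix, given as raw arrays, by a dense vector.

    coords has shape (2, nnz): row indices in coords[0], column indices
    in coords[1]. data has shape (nnz,) and holds the stored values.
    """
    out = np.zeros(n_rows)
    for k in range(data.shape[0]):
        out[coords[0, k]] += data[k] * vec[coords[1, k]]
    return out


# Build a small matrix with some zeros, as in the example earlier in the thread.
rng = np.random.default_rng(0)
x = rng.random((4, 4))
x[x < 0.5] = 0.0

# Stand-ins for the COO attributes: with sparse one would instead write
#   s = sparse.COO(x); coords, data = s.coords, s.data
nonzero = np.nonzero(x)
coords = np.array(nonzero)   # shape (2, nnz), integer indices
data = x[nonzero]            # shape (nnz,), stored values

vec = np.arange(4, dtype=np.float64)
result = coo_matvec(coords, data, x.shape[0], vec)
```

The jitted function only ever sees ordinary numpy arrays, so it does not depend on the experimental COO typing at all. The cost is that the shape and the coordinate layout have to be passed around by hand.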

I understand that it is not a priority, but this is why I keep coming back to ask about it (I will not ask anymore, I promise :-) ).