pydata / sparse

Sparse multi-dimensional arrays for the PyData ecosystem
https://sparse.pydata.org
BSD 3-Clause "New" or "Revised" License

dtype: NotImplementedError: float16 #263

Open SultanOrazbayev opened 5 years ago

SultanOrazbayev commented 5 years ago

Hello, I am trying to reduce the memory footprint of matrices containing small integers, and want to set the dtype to np.int8 or np.float16.

import sparse
import numpy as np
x = sparse.COO(np.random.rand(5 * 10**2, 5 * 10**2)).astype(np.float16)
y = sparse.dot(x, x)
z = sparse.dot(y, y)

This is the error I get:

myenvpath/lib/python3.6/site-packages/sparse/coo/common.py in _dot_coo_coo_type(dt1, dt2)
   1208 
   1209     @numba.jit(nopython=True, nogil=True,
-> 1210                locals={'data_curr': numba.numpy_support.from_dtype(dtr)})
   1211     def _dot_coo_coo(coords1, data1, coords2, data2):  # pragma: no cover
   1212         """

myenvpath/lib/python3.6/site-packages/numba/numpy_support.py in from_dtype(dtype)
    106             return types.NestedArray(subtype, dtype.shape)
    107 
--> 108     raise NotImplementedError(dtype)
    109 
    110 

NotImplementedError: float16

hameerabbasi commented 5 years ago

Two questions:

  1. Does it work with Numba 0.44?
  2. Does it work with other dtypes?

SultanOrazbayev commented 5 years ago

  1. No, it doesn't work with Numba 0.44 (and sparse 0.7).

import sparse
import numba
import numpy as np
numba.__version__, sparse.__version__
>('0.44.0', '0.7.0')

x = sparse.COO(np.random.rand(5 * 10**2, 5 * 10**2)).astype(np.float16)
y = sparse.dot(x, x)
z = sparse.dot(y, y)

  2. The code above works for the following dtypes: np.float32, np.float_, np.int_... Actually, it seems that the only dtype the code doesn't work for is np.float16.

hameerabbasi commented 5 years ago

Okay. Then let me give you a bit of background: float16 is a bit of an oddball. It doesn’t exist in standard C (nor in most other languages), and most CPUs don’t have hardware arithmetic instructions for it (GPUs do). On a CPU, the values are typically cast to float32, the operation is done in float32, and the result is converted back.
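
To make the "cast, operate, convert back" pattern concrete, here is a minimal NumPy-only sketch. The helper name mul_via_float32 is hypothetical and exists only for illustration. It is not how sparse or Numba implement anything; it only shows that widening to float32 and narrowing again should give the same float16 result as NumPy's own float16 multiplication for these inputs.

```python
import numpy as np


def mul_via_float32(a, b):
    """Multiply two float16 arrays by widening to float32 and narrowing back.

    Hypothetical helper for illustration only.
    """
    wide = a.astype(np.float32) * b.astype(np.float32)  # arithmetic in float32
    return wide.astype(np.float16)                       # round back to float16


a = np.array([0.1, 1.5, 3.25], dtype=np.float16)
b = np.array([2.0, 0.3, 1.75], dtype=np.float16)

emulated = mul_via_float32(a, b)
native = a * b  # NumPy's own float16 multiplication

# Compare the two results element by element.
same = np.array_equal(emulated, native)
print(emulated.dtype, same)
```

The same idea suggests a possible workaround for the failing call: do the arithmetic in float32 and only store the result as float16, keeping in mind that float16 cannot represent finite values above 65504.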

I’ll try to get a reproducer for this and report it to the Numba team (who are responsible for this bug) but I suspect it will be a low priority.

In the meantime, I suggest you don’t worry too much about the memory footprint: the coordinates are stored as intp (one per dimension for every stored element), so shrinking the data dtype alone is unlikely to save much.
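
A rough back-of-the-envelope sketch of that argument follows, assuming a COO layout with one intp coordinate per dimension for every stored element, as described above. The function coo_nbytes is a hypothetical helper, not part of sparse, and it ignores any fixed per-array overhead.

```python
import numpy as np


def coo_nbytes(nnz, ndim, data_dtype, coord_dtype=np.intp):
    """Estimate bytes used by a COO array: coordinates plus data.

    Hypothetical helper; assumes one coordinate per dimension per stored element.
    """
    coords = nnz * ndim * np.dtype(coord_dtype).itemsize
    data = nnz * np.dtype(data_dtype).itemsize
    return coords + data


# Sizes matching the 500 x 500 example from the report, with every entry stored.
nnz, ndim = 250_000, 2

for dt in (np.float64, np.float32, np.float16):
    print(np.dtype(dt).name, coo_nbytes(nnz, ndim, dt), "bytes")
```

With 8-byte coordinates in two dimensions, each stored element costs 16 bytes of coordinates regardless of the data dtype, so going from float64 to float16 reduces the per-element cost from 24 to 18 bytes, a saving of about 25%.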

SultanOrazbayev commented 5 years ago

Thanks!

hameerabbasi commented 5 years ago

xref numba/numba#4402