Open mroeschke opened 3 weeks ago
hi @mroeschke
Based on the change history: NaNs, `cp.nan`, `.dtype_can_compare_equal_to_other`
The changes introduce type checks on `DecimalDtype` that are not necessary to fix the bug; I think they are over-zealous.
cupy does not fully implement numpy's `asarray`; at the very least, its `dtype` argument does not support `Decimal128Dtype`.
I tried removing `cudf.core.dtypes.DecimalDtype` from the function `dtype_can_compare_equal_to_other`, so that `Decimal128Dtype` is treated as a numeric dtype and can compare equal to other types.
    def assert_column_equal(
        ...
            left.apply_boolean_mask(
                left.isnull().unary_operator("not")
            ).values,
        ...
cudf/cudf/core/column/column.py
    @property
    def values(self) -> cupy.ndarray:
        """
        Return a CuPy representation of the Column.
        """
        if len(self) == 0:
            return cupy.array([], dtype=self.dtype)
        if self.has_nulls():
            raise ValueError("Column must have no nulls.")
        return cupy.asarray(self.data_array_view(mode="write"))
This raises:
TypeError: Cannot interpret 'Decimal128Dtype(precision=1, scale=0)' as a data type
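The message above has the same shape as the one NumPy produces when it is handed an object it cannot convert to a dtype. Below is a minimal sketch of that behavior. It assumes a hypothetical stand-in class, `FakeDecimalDtype` (not cudf's actual `Decimal128Dtype`), and uses NumPy rather than CuPy so it runs without a GPU. It does not show which call inside `values` actually fails.

```python
import numpy as np


class FakeDecimalDtype:
    """Hypothetical stand-in for a decimal dtype; not part of cudf or NumPy."""

    def __init__(self, precision, scale=0):
        self.precision = precision
        self.scale = scale

    def __repr__(self):
        return (
            f"FakeDecimalDtype(precision={self.precision}, scale={self.scale})"
        )


def try_make_dtype(obj):
    """Ask NumPy to interpret obj as a dtype; return (ok, error message)."""
    try:
        np.dtype(obj)
    except TypeError as exc:
        return False, str(exc)
    return True, ""


# NumPy has no idea how to map the stand-in onto one of its own dtypes.
ok, message = try_make_dtype(FakeDecimalDtype(1))
print(ok, message)
```

Any code path that forwards a custom dtype object to an array library without first translating it to a supported dtype would be expected to fail in a similar way.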
Code to reproduce:
import cudf
ser = cudf.Series([1], dtype=cudf.Decimal128Dtype(1))
left = ser._column
left.apply_boolean_mask(left.isnull().unary_operator("not")).values
With numpy:
import numpy
obj = left.apply_boolean_mask(left.isnull().unary_operator("not"))
numpy.asarray(obj)
Out[11]:
array(<cudf.core.column.decimal.Decimal128Column object at 0x726ea7de4f70>
[
1
]
dtype: decimal128, dtype=object)
With cupy:
import cupy
cupy.asarray(obj)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[14], line 1
----> 1 cupy.asarray(obj)
File ~/Code/cudf/.venv/lib/python3.10/site-packages/cupy/_creation/from_data.py:88, in asarray(a, dtype, order, blocking)
56 def asarray(a, dtype=None, order=None, *, blocking=False):
57 """Converts an object to array.
58
59 This is equivalent to ``array(a, dtype, copy=False, order=order)``.
(...)
86
87 """
---> 88 return _core.array(a, dtype, False, order, blocking=blocking)
File cupy/_core/core.pyx:2408, in cupy._core.core.array()
File cupy/_core/core.pyx:2435, in cupy._core.core.array()
File cupy/_core/core.pyx:2574, in cupy._core.core._array_default()
ValueError: Unsupported dtype object
We'll first need to assert that the dtypes are equivalent, then probably use pandas assertion functions instead of cupy/numpy for comparing decimal values.
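The two-step idea in the comment above could be sketched roughly as follows. This is an illustration under assumptions, not cudf's implementation: `assert_decimal_columns_equal` is a hypothetical helper, the dtypes are represented by plain tuples, and the values are assumed to have already been copied to the host as Python `Decimal` objects.

```python
from decimal import Decimal

import pandas as pd


def assert_decimal_columns_equal(left_values, right_values, left_dtype, right_dtype):
    """Hypothetical helper: check dtypes first, then compare values with pandas."""
    # Step 1: the dtype metadata (e.g. precision and scale) must agree.
    if left_dtype != right_dtype:
        raise AssertionError(f"dtypes differ: {left_dtype!r} != {right_dtype!r}")

    # Step 2: compare the values as object-dtype Series of Decimal objects,
    # so no conversion to a cupy/numpy numeric dtype is needed.
    left = pd.Series(list(left_values), dtype=object)
    right = pd.Series(list(right_values), dtype=object)
    pd.testing.assert_series_equal(left, right)


# Stand-in dtype descriptions: (name, precision, scale). These are assumptions
# made for the sketch, not cudf objects.
dtype_a = ("decimal128", 1, 0)
assert_decimal_columns_equal([Decimal("1")], [Decimal("1")], dtype_a, dtype_a)
```

Checking the dtype separately keeps a precision or scale mismatch from being hidden by values that happen to compare equal.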
Describe the bug
Expected behavior
I would expect no `AssertionError`. It appears there's a testing function, `dtype_can_compare_equal_to_other`, used in column comparisons that over-zealously assumes two objects with `DecimalDtype`s shouldn't be compared to each other.
Environment overview (please complete the following information)