Open VibhuJawa opened 1 year ago
Thank you for posting this issue. fillna seems to have problems with list type support.
FWIW here is a shorter repro:
import cudf
df = cudf.DataFrame({'a':[[1], [1], None,None,None]})
subdf = cudf.DataFrame({'a':[None, None, [2], [2], [2]]})
df.fillna(subdf)
Bump.
xref: https://github.com/rapidsai/cugraph/issues/3010
Other examples using cudf.Series:
import cudf
s = cudf.Series([[1], None, [3], None])
t = cudf.Series([None, [2], None, [4]])
# These all fail
s.fillna(t)
s.fillna(cudf.Scalar([0]))
s.fillna([0])
s[s.isnull()] = t
s[s.isnull()] = t[s.isnull()]
## These work:
# s[s.isnull()] = [0]
# t[t.isnull()] = cudf.Scalar([0])
It would be straightforward to add the Python layer for this (probably add ListColumn.fillna), but it would still need a low-level implementation (perhaps in CUDA called from cpp/src/replace/nulls.cu:replace_nulls_column_kernel_forwarder).
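To illustrate why the low-level part is more involved than for fixed-width types, here is a rough pure-Python sketch of replacing null rows in a list column stored in an Arrow-style layout (offsets, flat child values, validity flags). The function name replace_null_lists and the plain-list representation are assumptions made for illustration; this is not libcudf code and not necessarily how a real kernel would be structured.

```python
from itertools import accumulate


def replace_null_lists(offsets, values, valid, r_offsets, r_values, r_valid):
    """Hypothetical helper: fill null rows of a list column from a replacement column.

    Each column is given as offsets (n + 1 ints), flat child values,
    and one validity flag per row.
    """
    n = len(valid)
    # Per row, choose the source: 0 = original, 1 = replacement, None = still null.
    source = [0 if valid[i] else (1 if r_valid[i] else None) for i in range(n)]

    # Row lengths come from the chosen source; rows that stay null hold no elements.
    lengths = []
    for i, src in enumerate(source):
        if src == 0:
            lengths.append(offsets[i + 1] - offsets[i])
        elif src == 1:
            lengths.append(r_offsets[i + 1] - r_offsets[i])
        else:
            lengths.append(0)

    # A running sum of the lengths, with a leading 0, gives the new offsets.
    out_offsets = [0] + list(accumulate(lengths))

    # Gather the child elements row by row from whichever source was chosen.
    out_values = []
    for i, src in enumerate(source):
        if src == 0:
            out_values.extend(values[offsets[i]:offsets[i + 1]])
        elif src == 1:
            out_values.extend(r_values[r_offsets[i]:r_offsets[i + 1]])

    out_valid = [src is not None for src in source]
    return out_offsets, out_values, out_valid


# s = [[1], None, [3], None] and t = [None, [2], None, [4]] in this layout
result = replace_null_lists(
    [0, 1, 1, 2, 2], [1, 3], [True, False, True, False],
    [0, 0, 1, 1, 2], [2, 4], [False, True, False, True],
)
```

Unlike a fixed-width column, the output offsets and child buffer both have to be rebuilt, because a filled row can have a different length than the null row it replaces.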
Describe the bug
fillna fails on a DataFrame that contains list dtype columns.
Steps/Code to reproduce bug
Expected behavior
I expect this to work like it does for non list columns.
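As a rough illustration of the semantics I have in mind, here is a plain-Python reference using ordinary lists in place of cudf columns. The helper name fillna_reference is hypothetical and only models the row-wise behavior, assuming both columns are already aligned by position:

```python
def fillna_reference(column, replacement):
    """Hypothetical reference: take the replacement row wherever the original row is null."""
    return [r if v is None else v for v, r in zip(column, replacement)]


# Same data as the shorter DataFrame repro, column 'a'
df_a = [[1], [1], None, None, None]
subdf_a = [None, None, [2], [2], [2]]

expected = fillna_reference(df_a, subdf_a)
print(expected)  # [[1], [1], [2], [2], [2]]
```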
Environment overview (please complete the following information)
Additional context
This impacts the property graph in cugraph, where fillna is used:
https://github.com/rapidsai/cugraph/blob/cb0d0923616f656ec816f999aa633ecbf3c57267/python/cugraph/cugraph/structure/property_graph.py#L758
https://github.com/rapidsai/cugraph/blob/cb0d0923616f656ec816f999aa633ecbf3c57267/python/cugraph/cugraph/structure/property_graph.py#L1155