rapidsai / cuxfilter

GPU accelerated cross filtering with cuDF.
https://docs.rapids.ai/api/cuxfilter/stable/
Apache License 2.0

[BUG] cudf Series code failing: "TypeError: use cudf.Series._from_data" #618

Closed by jameslamb 1 month ago

jameslamb commented 1 month ago

Describe the bug

On 24.10, several tests are failing like this:

TypeError: Use cudf.Series._from_data for constructing a Series from ColumnAccessor or a ColumnBase

I observed this in CI for a PR that wasn't touching any Python code (e.g. #616).

Steps/Code to reproduce bug

CI from #616: (build link)

Expected behavior

N/A

Environment details (please complete the following information):

Using the latest cudf 24.10 nightlies.

This does not appear to be limited to a specific subset of Python version, CPU architecture, or CUDA version. It affects both wheels and conda packages.

Additional context

I believe this has the same root cause as https://github.com/rapidsai/cuspatial/issues/1433, and there are similar reports in other RAPIDS projects.

cuxfilter needs to adapt to these changes: https://github.com/rapidsai/cudf/pull/16454

For reference, here's how @mroeschke is approaching this in cuspatial: https://github.com/rapidsai/cuspatial/pull/1434

jameslamb commented 1 month ago

Looking more closely at the stack traces in logs, this might not require changes in cuxfilter. It's possible all of these errors are coming from cuspatial:

/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuxfilter/charts/core/non_aggregate/core_non_aggregate.py:142: in cb
    self.selected_indices = point_in_polygon(self.source, *args)
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuxfilter/charts/core/non_aggregate/utils.py:7: in point_in_polygon
    points = cuspatial.GeoSeries.from_points_xy(
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuspatial/core/geoseries.py:704: in from_points_xy
    GeoColumn._from_points_xy(as_column(points_xy, dtype=coords_dtype))
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuspatial/core/_column/geocolumn.py:140: in _from_points_xy
    meta = GeoMeta(
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuspatial/core/_column/geometa.py:31: in __init__
    self.input_types = cudf.Series(meta["input_types"], dtype="int8")
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cudf/utils/performance_tracking.py:51: in wrapper
    return func(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

If that's the case, https://github.com/rapidsai/cuspatial/pull/1434 will fix this for cuxfilter too.
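For anyone triaging similar logs, here is a rough sketch of the check described above: scan the pytest-style frame lines and report which installed package made the call into cudf. The helper names (`parse_frames`, `caller_of`) and the regex are hypothetical, written for this illustration only, and `TRACE` is just the frame lines copied from the log above.

```python
import re

# Matches pytest-style frame lines such as
# ".../site-packages/cuspatial/core/geoseries.py:704: in from_points_xy"
FRAME_RE = re.compile(
    r"site-packages/(?P<package>[^/\s]+)/\S*?:(?P<lineno>\d+): in (?P<func>\w+)"
)


def parse_frames(text):
    """Return (package, line number, function) per frame, outermost first."""
    return [
        (m["package"], int(m["lineno"]), m["func"])
        for m in FRAME_RE.finditer(text)
    ]


def caller_of(text, library):
    """Return the last frame outside `library` before the stack enters it."""
    frames = parse_frames(text)
    for prev, cur in zip(frames, frames[1:]):
        if cur[0] == library and prev[0] != library:
            return prev
    return None


# Frame lines copied from the CI log in this comment.
TRACE = """
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuxfilter/charts/core/non_aggregate/core_non_aggregate.py:142: in cb
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuxfilter/charts/core/non_aggregate/utils.py:7: in point_in_polygon
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuspatial/core/geoseries.py:704: in from_points_xy
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuspatial/core/_column/geocolumn.py:140: in _from_points_xy
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cuspatial/core/_column/geometa.py:31: in __init__
/pyenv/versions/3.9.19/lib/python3.9/site-packages/cudf/utils/performance_tracking.py:51: in wrapper
"""

print(caller_of(TRACE, "cudf"))  # ('cuspatial', 31, '__init__')
```

The frame that calls into cudf lives in cuspatial's `geometa.py`, not in cuxfilter, which is consistent with the fix belonging in cuspatial.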

mroeschke commented 1 month ago

It looks like the tests in the most recent build are generally passing now: https://github.com/rapidsai/cuxfilter/actions/runs/10355831367

Is it safe to close this out?

jameslamb commented 1 month ago

Yep, I see other test failures there, but it does look like this issue has been addressed by the changes in cuspatial. I agree we can close this.

Thanks @mroeschke !