holoviz / datashader

Quickly and accurately render even the largest data.
http://datashader.org
BSD 3-Clause "New" or "Revised" License

Add support for categorical where reductions #1237

Closed ianthomas23 closed 1 year ago

ianthomas23 commented 1 year ago

Fixes #1210.

This adds support for categorical where reductions on CPU and GPU, with and without Dask.

An example is

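import datashader as ds

# Canvas takes plot_width first, then plot_height; keywords give an aggregate of shape (ny, nx, ...)
canvas = ds.Canvas(plot_width=nx, plot_height=ny)
agg = canvas.points(..., agg=ds.by("cat", ds.where(ds.max_n("mass", n=3))))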

This returns a 4D xarray.DataArray of shape (ny, nx, ncat, n) containing for each pixel and category the indexes of the 3 rows in the supplied DataFrame that have the maximum values of the "mass" column.
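The semantics of that result can be illustrated with a small pure-Python reference. This is only a sketch of what the reduction computes, not how datashader implements it: the function name `categorical_where_max_n` and the pre-binned `(y, x, cat, mass)` row layout are hypothetical, and the use of -1 for unfilled slots and the tie-breaking rule are assumptions.

```python
from collections import defaultdict


def categorical_where_max_n(rows, ny, nx, categories, n):
    """Return nested lists of shape (ny, nx, ncat, n) holding row indexes.

    rows is a list of (pixel_y, pixel_x, category, mass) tuples, i.e. points
    that have already been binned to pixels (a simplification for this sketch).
    """
    cat_pos = {c: i for i, c in enumerate(categories)}
    # Assumption: slots with no matching row are filled with -1.
    out = [[[[-1] * n for _ in categories] for _ in range(nx)] for _ in range(ny)]

    # Group (mass, row_index) candidates by (pixel, category).
    buckets = defaultdict(list)
    for idx, (y, x, cat, mass) in enumerate(rows):
        buckets[(y, x, cat_pos[cat])].append((mass, idx))

    for (y, x, c), candidates in buckets.items():
        # Largest mass first; ties go to the lower row index (assumption).
        candidates.sort(key=lambda t: (-t[0], t[1]))
        top = [idx for _, idx in candidates[:n]]
        out[y][x][c][:len(top)] = top
    return out


rows = [
    (0, 0, "a", 1.0),  # row 0
    (0, 0, "a", 5.0),  # row 1
    (0, 0, "b", 2.0),  # row 2
    (0, 0, "a", 3.0),  # row 3
    (0, 0, "a", 4.0),  # row 4
    (1, 1, "b", 9.0),  # row 5
]
agg = categorical_where_max_n(rows, ny=2, nx=2, categories=["a", "b"], n=3)
print(agg[0][0][0])  # indexes of the 3 largest "mass" rows for pixel (0, 0), category "a"
```

In this toy data, pixel (0, 0) has four rows in category "a", so only the three with the largest mass are kept, ordered from largest to smallest.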

To return values from another column instead of row indexes, pass that column name as the second argument to where:

agg = canvas.points(..., agg=ds.by("cat", ds.where(ds.max_n("mass", n=3), "other")))
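Conceptually, supplying a lookup column swaps each selected row index for that row's value in the named column. The sketch below shows that mapping in plain Python. The helper `lookup_column` is hypothetical, and representing "no row" as -1 in the index aggregate and as NaN in the value aggregate is an assumption rather than something stated in this PR.

```python
import math


def lookup_column(index_agg, column):
    """Recursively replace row indexes with column values; -1 becomes NaN."""
    if isinstance(index_agg, list):
        return [lookup_column(item, column) for item in index_agg]
    return column[index_agg] if index_agg >= 0 else math.nan


# Values of the hypothetical "other" column, one per DataFrame row.
other = [10.0, 11.0, 12.0, 13.0, 14.0]

# A tiny index aggregate of shape (ny=1, nx=1, ncat=2, n=3).
index_agg = [[[[1, 4, 3], [2, -1, -1]]]]

values = lookup_column(index_agg, other)
print(values[0][0][0])  # "other" values of the rows selected for the first category
```

The shape of the result is unchanged; only the contents switch from integer row indexes to values taken from the lookup column.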

We can replace max_n in this example with max, min, first, last, min_n, first_n, or last_n.

Support is also added for

ds.by("cat", ds.first("value"))

and for the last, first_n, and last_n equivalents, since these are implemented using where under certain circumstances (on GPU and/or with Dask).
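One plausible way to see why first can be expressed through where when the data is partitioned: chunks may be processed in any order, so each chunk has to report the row index of its first value, and the combine step keeps the value with the smallest index. The sketch below illustrates that idea for a single pixel and category. It is an assumption about the general approach, not code from this PR, and `first_in_chunk` and `combine_first` are hypothetical names.

```python
def first_in_chunk(offset, values):
    """Return (global_row_index, value) for the first row of a chunk.

    offset is the global index of the chunk's first row. An empty chunk
    reports (-1, None), meaning it has no candidate.
    """
    if not values:
        return (-1, None)
    return (offset, values[0])


def combine_first(partials):
    """Keep the value whose global row index is smallest; None if no chunk had data."""
    valid = [(idx, val) for idx, val in partials if idx >= 0]
    if not valid:
        return None
    return min(valid, key=lambda t: t[0])[1]


# Chunks arrive out of order, as they might with parallel execution.
chunks = [(3, [7.0, 8.0]), (0, [5.0, 6.0, 9.0]), (5, [])]
partials = [first_in_chunk(offset, values) for offset, values in chunks]
result = combine_first(partials)
print(result)  # value of the row with the smallest global index
```

Swapping the minimum for a maximum over row indexes, and the first element of each chunk for the last, gives the corresponding sketch for last.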

codecov[bot] commented 1 year ago

Codecov Report

Merging #1237 (38c83f6) into main (5a89820) will decrease coverage by 0.15%. The diff coverage is 71.59%.

@@            Coverage Diff             @@
##             main    #1237      +/-   ##
==========================================
- Coverage   83.52%   83.37%   -0.15%     
==========================================
  Files          35       35              
  Lines        8778     8832      +54     
==========================================
+ Hits         7332     7364      +32     
- Misses       1446     1468      +22     
Impacted Files              Coverage Δ
datashader/reductions.py    77.87% <69.51%> (-0.76%)
datashader/compiler.py      88.65% <100.00%> (+0.05%)


ianthomas23 commented 1 year ago

After rebasing, tests are failing with a bokeh-panel incompatibility when running the examples. That has nothing to do with this PR, so I am merging this and will deal with the examples problem separately.