holoviz / datashader

Quickly and accurately render even the largest data.
http://datashader.org
BSD 3-Clause "New" or "Revised" License

Implement lines using 2D xarray with common x coordinates #1282

Closed ianthomas23 closed 8 months ago

ianthomas23 commented 9 months ago

This is a work in progress to implement datashading lines from a 2D xarray that has common x coordinates, as described in issue #1278.

So far it works only for the count and any reductions, with and without Dask, and only on the CPU, not the GPU. Further work is underway to support other simple reductions, categorical and where reductions, antialiasing, CUDA, etc.

There is a limitation when the 2D array is chunked by Dask in the x direction: the line segments that join neighbouring chunks are not rendered. The same limitation occurs in other datashader line glyphs that chunk along arrays in a similar way, and no attempt will be made to address it in this PR.
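To give a feel for the size of this effect, here is a rough counting sketch. It is not datashader code: `segments_rendered` is a hypothetical helper that only assumes each chunk draws the segments between its own points and nothing else.

```python
def segments_rendered(n_points, chunk_size):
    """Count the segments drawn when n_points are split into chunks along x.

    Assumption: a chunk holding k points draws k - 1 segments, and the
    segment joining two neighbouring chunks is never drawn.
    """
    total = 0
    for start in range(0, n_points, chunk_size):
        k = min(chunk_size, n_points - start)  # points in this chunk
        total += max(k - 1, 0)
    return total


n_points = 100
full = segments_rendered(n_points, n_points)  # one chunk covering all of x
chunked = segments_rendered(n_points, 25)     # four chunks along x
print(full, chunked, full - chunked)
```

Under this model each line loses one segment per chunk boundary, which suggests keeping x in a single chunk where that is practical.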

This work is part of a CZI Round 5 grant.

codecov[bot] commented 9 months ago

Codecov Report

Merging #1282 (863f9d4) into main (d338d98) will decrease coverage by 0.01%. The diff coverage is 86.47%.

@@            Coverage Diff             @@
##             main    #1282      +/-   ##
==========================================
- Coverage   85.63%   85.63%   -0.01%     
==========================================
  Files          52       52              
  Lines       11128    11265     +137     
==========================================
+ Hits         9530     9647     +117     
- Misses       1598     1618      +20     

ianthomas23 commented 8 months ago

Here is an example using nchannel lines of random data, each line offset from its neighbours.

import colorcet as cc
import datashader as ds
import numpy as np
import xarray as xr

# Common x coordinates shared by every line.
nx = 100
x = np.linspace(0.0, 1.0, nx)

# One line per channel, labelled 'A' to 'Z'.
nchannel = 26
channel = [chr(i) for i in range(65, 65+nchannel)]

# Random data with the mean of each channel equal to its index, so the lines are offset.
rng = np.random.default_rng(8321)
data = rng.normal(loc=np.arange(nchannel).reshape((-1, 1)), scale=0.5, size=(nchannel, nx))

# 2D variable "some_name" holds the y values; 1D variable "value" holds one number per channel.
xr_ds = xr.Dataset(
    data_vars=dict(
        some_name=(("channel", "x"), data),
        value=("channel", np.arange(nchannel))
    ),
    coords=dict(
        channel=("channel", channel),
        x=("x", x),
    ),
)
print(xr_ds)

# Render the lines, aggregating with first("value"), then shade the result.
canvas = ds.Canvas(plot_height=600, plot_width=800)
agg = canvas.line(source=xr_ds, x="x", y="some_name", agg=ds.first("value"))
im = ds.transfer_functions.shade(agg, how="linear", cmap=cc.glasbey_cool)

which produces

<xarray.Dataset>
Dimensions:    (channel: 26, x: 100)
Coordinates:
  * channel    (channel) <U1 'A' 'B' 'C' 'D' 'E' 'F' ... 'U' 'V' 'W' 'X' 'Y' 'Z'
  * x          (x) float64 0.0 0.0101 0.0202 0.0303 ... 0.9697 0.9798 0.9899 1.0
Data variables:
    some_name  (channel, x) float64 0.5883 0.06726 -0.8486 ... 24.96 26.22 25.42
    value      (channel) int64 0 1 2 3 4 5 6 7 8 ... 17 18 19 20 21 22 23 24 25

and this image:

[image: lines_xarray_example]

ianthomas23 commented 8 months ago

This now works for xarray DataArrays that are backed by

but not for dask-cupy arrays.

ianthomas23 commented 8 months ago

Also note that the 2D xr.DataArray can have coordinates either way round, i.e. the common x coords can be dimension 0 or 1.