Closed jbednar closed 6 years ago
Yes, you're going about this wrong. The default aggregator is count, which does nothing sensible for an Image. You can use aggregate/datashade if you set the appropriate x_sampling and y_sampling and an aggregator like ds.mean. What you really want, though, is the regrid operation.
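To illustrate why a count is uninformative for an Image while a mean-style aggregation amounts to regridding, here is a minimal pure-Python sketch. It does not use datashader or HoloViews; the helper names downsample and mean and the simple block-binning scheme are assumptions made purely for illustration.

```python
def downsample(image, factor, reduce):
    """Bin a 2D list into factor x factor blocks and reduce each block."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect every pixel that falls inside the current block.
            block = [image[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            out_row.append(reduce(block))
        out.append(out_row)
    return out


def mean(values):
    return sum(values) / len(values)


# A 10x10 image whose value at column x, row y is x * y.
image = [[x * y for x in range(10)] for y in range(10)]

# Counting only reports how many pixels landed in each bin.
counts = downsample(image, 5, len)    # [[25, 25], [25, 25]]

# Averaging preserves the image content at the coarser resolution.
means = downsample(image, 5, mean)    # [[4.0, 14.0], [14.0, 49.0]]
```

On a regular grid every bin receives the same number of pixels, so the count carries no information about the image values, whereas the mean is a lower-resolution version of the image.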
Repurposed the issue to document regrid in HoloViews.
Hmm. Regrid is better, but doesn't work either:
Can't reproduce that, this works fine:
img = hv.Image(np.arange(10) * np.arange(10)[np.newaxis].T)
img + regrid(img)
What version of datashader do you have? I know it's less convenient but I always add the code and plot separately.
Github master (0.6.1-6-g09973d3).
Anyway, I'm surprised that a separate operation is needed. If I do provide an aggregator, I get something more recognizable:
But it's clearly converted the image to points, which I would have expected to require datashade(hv.Points(hv.Image(r))); I was not expecting datashade to change the type of what I provided.
You should try the code in the first cell above, but with datashade replaced by regrid. In your range-based example I can't tell whether it's working properly or not, but with the above code I can see that it definitely is not.
Could you paste the code?
Sigh...there appears to be another bug in datashader.
The copyable code is in the first cell...
(Only the very last line changes in any of the examples here).
I know this isn't the way it's currently implemented, but the way I expected this to work is:
Each of the indicated agg defaults is the default value in datashader (for None).
I know raster() doesn't currently use the same agg options, but I think it should eventually, at least for downsampling, so it would be nice to unify that at the ds level. Otherwise, is this very different from how you think it should be?
I'd be happy for aggregate to provide some convenience around regrid, but I don't think it replaces it, because that is more general than what aggregate can provide. Apart from that, the main thing I'm worried about is that it would have to ignore the default count aggregator, and that some others like count_cat and any are also pointless for an Image.
I know raster() doesn't currently use the same agg options, but I think it should eventually, at least for downsampling, so it would be nice to unify that at the ds level.
The agg objects are a bit weird in the case of images because there is no column to refer to, so there's no reason to do ds.mean('z'), and just using ds.mean would still be inconsistent.
Well, all of the canvas glyphs actually default to None; can't we use that?
any isn't pointless for an image with an alpha mask; it should basically extract that mask at a different resolution. count_cat is meant to be generalized to cat(count...), and some of the aggregators provided to it may be more meaningful.
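The point about any can be sketched without datashader: reducing each block of an alpha mask with any yields the same mask at a coarser resolution. The function name coarsen_mask is hypothetical, and the block-based binning is an assumption for illustration rather than how datashader's raster code works.

```python
def coarsen_mask(mask, factor):
    """Return a coarser mask: a block is True if any pixel in it is True."""
    rows, cols = len(mask), len(mask[0])
    return [[any(mask[i][j]
                 for i in range(r, min(r + factor, rows))
                 for j in range(c, min(c + factor, cols)))
             for c in range(0, cols, factor)]
            for r in range(0, rows, factor)]


# A 4x4 alpha mask with a single opaque pixel in the top-left quadrant.
mask = [
    [False, False, False, False],
    [False, True,  False, False],
    [False, False, False, False],
    [False, False, False, False],
]

coarse = coarsen_mask(mask, 2)    # [[True, False], [False, False]]
```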
Fixed in https://github.com/bokeh/datashader/pull/475, please test when you get a minute.
The bug is fixed, but I do think we need to address the API issues still.
As a first step we'd have to support ds.mean(), ds.min(), etc. so the API in datashader is at least consistent. Currently that results in:
TypeError: __init__() missing 1 required positional argument: 'column'
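One way such a no-argument form could work is to make the column optional. The sketch below is not datashader's implementation: the class Mean is a stand-in written for illustration, and treating column=None as "use the raster's own values" is an assumption.

```python
class Mean:
    """Stand-in reduction (not datashader's class) with an optional column."""

    def __init__(self, column=None):
        # None means "no named column", as for an Image or raster,
        # which carries a single array of values.
        self.column = column

    def __call__(self, data):
        # Tabular input: a dict mapping column name to a list of values.
        # Raster input: a flat list of values, used when column is None.
        values = data if self.column is None else data[self.column]
        return sum(values) / len(values)


table_mean = Mean("z")({"z": [1.0, 2.0, 3.0]})   # 2.0
raster_mean = Mean()([2.0, 4.0])                 # 3.0
```

With a default of None, Mean() no longer raises the missing-argument TypeError, and the same object type can be passed for both tabular and raster inputs.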
It may be mentioned in another issue, but I would like datashade to do regridding when passed a raster type. This doesn't mean there can't also be a dedicated regrid operation for more advanced control with its own documentation.
If you agree with that idea, I would document the regridding capability of datashade first (once it is added), as passing rasters to datashade seemed like a pretty intuitive thing and something I expected to 'just work'.
I believe that's exactly what we already decided above.
I'm talking about documentation: we should document datashade first as being able to do regridding, then the regrid operation itself.
I agree with all that. Yes, sounds like we need to make those changes to ds.mean(), etc.
Those changes are now done, as of Datashader 0.6.5, so hv can now be updated to match.
I'll hand this over to you once I've improved rasterize.
While you're working on these docs there are a few performance-related things to point out:
precompute can be used to cache the last set of data. For dataframes/dask tabular datasets and xarray gridded datasets this makes no difference; for TriMesh/QuadMesh this will make a huge difference.
gv.operation.project
I'm not sure if I'm going about this the right way, but I don't seem to get anything reasonable out of datashading an Image:
I had expected this to use datashader's raster support, but when first loading there is no image at all in the second subfigure, and then on zooming out and back in I get various crazy patterns like this.
Eventually I don't want an Image, but an RGB where each of 3 or 4 underlying images is datashaded, but I can't see how I would even express that (datashade(hv.RGB(...))?). I'm trying to implement https://anaconda.org/jbednar/landsat using holoviews, but am not having much luck.