Closed philippjfr closed 6 years ago
Are you suggesting to project after aggregation? I suppose that would be very quick to compute, and therefore seems definitely worth supporting, but presumably there would be issues with uneven bin sizes for at least some projections?
Are you suggesting to project after aggregation?
No, I'm suggesting you should always project explicitly before aggregation.
Ah, so not projecting RGB, but Paths or Points? But if it supports any Element type, presumably it could be used to project the resulting RGB as well, just with less good results?
presumably it could be used to project the resulting RGB as well, just with less good results?
Projecting already gridded data is harder, but I guess that could be supported. I think the cleanest way to implement projecting of gridded data is to treat each pixel as an individual sample, then project the coordinates and then re-aggregate those samples with datashader.
That said I really don't know whether projecting RGB
types is even possible, how would the color interpolation happen? Would reaggregating each channel make sense?
Well, you said any Element, so I thought you had a plan. :-) You're right; for RGB it makes very little sense, and indeed would need some sort of weighted color interpolation. That's a case I would vote not to support.
But for Image, you get the same savings (i.e., a quick and dirty approximation) without the color issues, so it's a similar situation.
With enough resolution in the most distorted regions of the map, I think re-aggregating Image should work reasonably well. But don't you have non-datashader ways to re-aggregate? The total amount of data won't be large at this point, and re-aggregating a grid is something useful with or without datashader.
With enough resolution in the most distorted regions of the map, I think re-aggregating Image should work reasonably well. But don't you have non-datashader ways to re-aggregate? The total amount of data won't be large at this point, and re-aggregating a grid is something useful with or without datashader.
Yes, I'd like there to be but I've not found a good solution that doesn't rely on some other library. I still don't quite understand how cartopy does it, and my current ProjectImage
operation already in geoviews is really quite poor imo, but it may be the case that's basically equivalent to the datashader based approach I'm proposing.
I don't think the different possible implementations will vary in quality by much, only in speed, so I'd vote for just using your current implementation unless we find it's very slow. Rasterio offers regridding, and if you're going to rely on an external library I'd use that, as it's what datashader uses anyway.
Hi guys - has there been any update on this stuff? We're just embarking on a big Notebook + Geoviews type project for our forecasters and it would be great if we could use datashader like this
Not yet, that said it wouldn't be all that much effort, so I'd be happy to prioritize this. It's not been a huge priority so far because usually we simply project the data before datashading, which means the resulting Image or RGB doesn't have to be projected again.
To expand a little bit, you can currently use the gv.operation.project
operation to project data from a source crs
to a target crs
before applying the datashade operation. The actual issue is that the output of the datashade
operation will be a HoloViews Image/RGB element which does not know anything about coordinate systems (and therefore cannot be automatically projected when you try to plot it on a different projection than it is already using). It should be easy enough to extend datashade
to find the crs
of the input and use that to construct a GeoViews Image/RGB on the output. It's also worth noting that this limitation only applies when you are datashading Points
or Paths
, the regrid
operation which regrids Image/RGBs does retain the crs
information.
I see - I'd slightly misunderstood - thanks for the clarification. So it is possible to plot geospatial stuff with data-shader, just the API isn't particularly natural at the minute.
We're trying to plot very high resolution geospatial data (Iris Cube) using geoviews and its taking a long time. Personally I'd love a datashader=True
kwarg in image plots, but perhaps that's a bit simplistic.
Oh yes, Datashader is definitely the way to go with geospatial stuff in GeoViews. We desperately need to get some time to improve our documentation to make that clear. That's a project already underway, but not yet delivering.
The API isn't too far from what you suggest already: as of the current HoloViews, GeoViews, and Datashader releases, to use datashader to aggregate your HoloViews or GeoViews object obj
into a grid, you either do aggregate(obj)
(for non-gridded objs like Points, Curve, Graph and containers thereof), or regrid(obj)
(for already-gridded objs like Image, and containers thereof). If you want datashader to aggregate and shade (colormap) the object, you'd do shade(aggregate(obj))
or the shortcut datashade(obj)
(for non-gridded objs), or shade(regrid(obj))
(for gridded objs). In each case, these are dynamic operations that will automatically re-run datashader on the data before it is sent to the browser, thereby ensuring that all calculations are done quickly in Python and Numba, not slowly in the browser. So they basically work like datashade==True
, but without having to hack in datashader support to the existing object types directly.
Having regrid be separate from the other datashader operations is just a historical artifact, and we are planning to unify the operators so that you don't have to worry about the distinction between already-gridded and not-yet-gridded data. Before we do that we need to unify some of the underlying syntax between the options accepted by regrid()
and aggregate()
; again those differ for historical rather than deep reasons. But the current API isn't too bad, just not as clean as I would like.
BTW, here we're only talking about regular grids, but note that we've just added support in Datashader for irregular grids (triangular meshes) , to be merged soon, in case those come up in any of your workflows. Again, I'd like that to be unified with the existing datashader operations in HoloViews and GeoViews, so that you just call those same operations for irregular-mesh data as well.
Thanks for the detailed summary @jbednar, just to be clear plotting very high resolution iris cubes is already possible if you do something like this (adjust for the lat/lon dimensions of your data):
from holoviews.operation.datashader import regrid
ds = gv.Dataset(iris_cube)
regrid(ds.to(gv.Image, ['Longitude', 'Latitude']))
The bottleneck in this approach will be projecting the image, which relies on cartopy's image projecting routines. If you don't need that you can also just use hv.Image
in the code above, which should then be very fast indeed.
Thanks guys - I was using shade
rather than regrid
- I'm getting something that looks more sensible now! Having this working on irregular grids is very cool - it's not something I'd be using right now, but it always comes up sooner or later.
I don't suppose there is anyway to make this cache the plots is there? Using datashader
plots the data faster initially but is doesn't seem to cache the plots, making my nice geoviews
slider laggy.
btw I appreciate this isn't the datashader+geoviews help forum so just point me somewhere else if you'd rather I dind't pollute your thread.
Thanks a lot for the above information, it's really helpful.
When I use regrid
with the matplotlib backend the above code works as I'd expect. But with the bokeh backend I get DataError: None of the available storage backends were able to support the supplied data format.
i.e. given the setup:
# <iris 'Cube' of air_temperature / (K) (latitude: 1200; longitude: 1600)>
kdims = ['longitude', 'latitude']
vdims = ['air_temperature']
ds = gv.Dataset(cube, kdims=kdims, vdims=vdims)
gv_img = ds.to(gv.Image, kdims)
hv_img = ds.to(hv.Image, kdims)
These will plot:
hv.notebook_extension('bokeh')
regrid(hv_img)
hv.notebook_extension('matplotlib')
regrid(gv_img)
But this throws the exception:
hv.notebook_extension('bokeh')
regrid(gv_img)
My understanding is that I can only add gf.coastline
when using gv.Image
, so this means that when using datashader I need to choose between an interactive bokeh plot and one with coastlines.
Does this seem correct? If there's anything I can do to help that'd be great, I'd love to have interactive + geospatially aware datashader plots.
When I use regrid with the matplotlib backend the above code works as I'd expect. But with the bokeh backend I get DataError: None of the available storage backends were able to support the supplied data format.
That's very odd indeed, the plotting backends should have no effect on the data.
My understanding is that I can only add gf.coastline when using gv.Image, so this means that when using datashader I need to choose between an interactive bokeh plot and one with coastlines.
That's sort of true but not quite, if you project your data to Mercator coordinates using gv.operation.project_image
before you regrid you should be able to use gf.coastline
just fine.
Is the file you're using something that you can share? I'd like to look into the DataError.
Thanks, here's the slice of the dataset I'm using: https://s3-eu-west-1.amazonaws.com/geoviews-sample/air_temperature.nc. Can provide more if you want.
My understanding is that I can only add gf.coastline when using gv.Image, so this means that when using datashader I need to choose between an interactive bokeh plot and one with coastlines.
That's sort of true but not quite, if you project your data to Mercator coordinates using gv.operation.project_image before you regrid you should be able to use gf.coastline just fine.
Aha! This is excellent, thanks.
regrid(gv_img)
fails as before, but regrid(new_img)
works perfectly
Having the same issue, matplotlib
works but bokeh
gives error:
1. Create an xarray from a geotiff
filename = '../path/to/file/name.tif`
xarray = xr.open_rasterio(filename).load()
xarray
:
<xarray.DataArray (band: 4, y: 4569, x: 9087)>
array([[[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0],
...,
[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0]],
[[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0],
...,
[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0]],
[[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0],
...,
[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0]],
[[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0],
...,
[0, 0, ..., 0, 0],
[0, 0, ..., 0, 0]]], dtype=uint16)
Coordinates:
* band (band) int64 1 2 3 4
* y (y) float64 2.57e+06 2.57e+06 2.57e+06 2.57e+06 2.57e+06 ...
* x (x) float64 3.612e+05 3.612e+05 3.613e+05 3.613e+05 3.613e+05 ...
Attributes:
crs: +init=epsg:32617
res: (3.0, 3.0)
is_tiled: 1
transform: (3.0, 0.0, 361245.0, 0.0, -3.0, 2569614.0, 0.0, 0.0, 1.0)
2. Create a gv.dataset
:
gv_dataset = gv.Dataset(xarray, kdims=['band','x','y'], vdims=['Raster'] , crs=ccrs.UTM(17))
gv_dataset
: :Dataset [band,x,y] (Raster)
3. Create a gv.image
:
image = regrid(gv_dataset.to(gv.Image, kdims=['x','y']))
4. Render the image
Render the image with matplotlib
:
hv.notebook_extension('matplotlib')
image
Render the image with bokeh
:
hv.notebook_extension('bokeh')
image
DataError: None of the available storage backends were able to support the supplied data format.
Full error:
---------------------------------------------------------------------------
DataError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/IPython/core/formatters.py in __call__(self, obj)
330 pass
331 else:
--> 332 return printer(obj)
333 # Finally look for special method names
334 method = get_real_method(obj, self.print_method)
/usr/local/lib/python3.6/dist-packages/holoviews/ipython/display_hooks.py in pprint_display(obj)
257 if not ip.display_formatter.formatters['text/plain'].pprint:
258 return None
--> 259 return display(obj, raw=True)
260
261
/usr/local/lib/python3.6/dist-packages/holoviews/ipython/display_hooks.py in display(obj, raw, **kwargs)
243 elif isinstance(obj, (HoloMap, DynamicMap)):
244 with option_state(obj):
--> 245 html = map_display(obj)
246 else:
247 return repr(obj) if raw else IPython.display.display(obj, **kwargs)
/usr/local/lib/python3.6/dist-packages/holoviews/ipython/display_hooks.py in wrapped(element)
130 try:
131 html = fn(element,
--> 132 max_frames=OutputSettings.options['max_frames'])
133
134 # Only want to add to the archive for one display hook...
/usr/local/lib/python3.6/dist-packages/holoviews/ipython/display_hooks.py in map_display(vmap, max_frames)
198 return None
199
--> 200 return render(vmap)
201
202
/usr/local/lib/python3.6/dist-packages/holoviews/ipython/display_hooks.py in render(obj, **kwargs)
60 if renderer.fig == 'pdf':
61 renderer = renderer.instance(fig='png')
---> 62 return renderer.html(obj, **kwargs)
63
64
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/renderer.py in html(self, obj, fmt, css, comm, **kwargs)
255 code to initialize a Comm, if the plot supplies one.
256 """
--> 257 plot, fmt = self._validate(obj, fmt)
258 figdata, _ = self(plot, fmt, **kwargs)
259 if css is None: css = self.css
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/renderer.py in _validate(self, obj, fmt)
191 if isinstance(obj, tuple(self.widgets.values())):
192 return obj, 'html'
--> 193 plot = self.get_plot(obj, renderer=self)
194
195 fig_formats = self.mode_formats['fig'][self.mode]
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/bokeh/renderer.py in get_plot(self_or_cls, obj, doc, renderer)
114 combining the bokeh model with another plot.
115 """
--> 116 plot = super(BokehRenderer, self_or_cls).get_plot(obj, renderer)
117 if self_or_cls.mode == 'server' and doc is None:
118 doc = curdoc()
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/renderer.py in get_plot(self_or_cls, obj, renderer)
178 plot = self_or_cls.plotting_class(obj)(obj, renderer=renderer,
179 **plot_opts)
--> 180 plot.update(0)
181 else:
182 plot = obj
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/plot.py in update(self, key)
484 if len(self) == 1 and key == 0 and not self.drawn:
485 return self.initialize_plot()
--> 486 item = self.__getitem__(key)
487 self.traverse(lambda x: setattr(x, '_updated', True))
488 return item
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/plot.py in __getitem__(self, frame)
219 if isinstance(frame, int) and frame > len(self):
220 self.warning("Showing last frame available: %d" % len(self))
--> 221 if not self.drawn: self.handles['fig'] = self.initialize_plot()
222 if not isinstance(frame, tuple):
223 frame = self.keys[frame]
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/bokeh/element.py in initialize_plot(self, ranges, plot, plots, source)
795 self.handles['plot'] = plot
796
--> 797 self._init_glyphs(plot, element, ranges, source)
798 if not self.overlaid:
799 self._update_plot(key, plot, style_element)
/usr/local/lib/python3.6/dist-packages/holoviews/plotting/bokeh/element.py in _init_glyphs(self, plot, element, ranges, source)
746 else:
747 style = self.style[self.cyclic_index]
--> 748 data, mapping, style = self.get_data(element, ranges, style)
749 current_id = element._plot_id
750 if source is None:
/usr/local/lib/python3.6/dist-packages/geoviews/plotting/bokeh/plot.py in get_data(self, element, ranges, style)
68 def get_data(self, element, ranges, style):
69 if self._project_operation and self.geographic and element.crs != DEFAULT_PROJ:
---> 70 element = self._project_operation(element)
71 return super(GeoPlot, self).get_data(element, ranges, style)
72
/usr/local/lib/python3.6/dist-packages/holoviews/core/operation.py in __call__(self, element, **params)
139 operation=self, kwargs=params)
140 elif isinstance(element, ViewableElement):
--> 141 processed = self._process(element)
142 elif isinstance(element, DynamicMap):
143 if any((not d.values) for d in element.kdims):
/usr/local/lib/python3.6/dist-packages/geoviews/operation.py in _process(self, img, key)
147 bounds = (extents[0], extents[2], extents[1], extents[3])
148 return img.clone(data, bounds=bounds, kdims=img.kdims,
--> 149 vdims=img.vdims, crs=proj)
150
151
/usr/local/lib/python3.6/dist-packages/geoviews/element/geo.py in clone(self, data, shared_data, new_type, *args, **overrides)
95 overrides['crs'] = self.crs
96 return super(_Element, self).clone(data, shared_data, new_type,
---> 97 *args, **overrides)
98
99
/usr/local/lib/python3.6/dist-packages/holoviews/core/dimension.py in clone(self, data, shared_data, new_type, *args, **overrides)
570 # Apply name mangling for __ attribute
571 pos_args = getattr(self, '_' + type(self).__name__ + '__pos_params', [])
--> 572 return clone_type(data, *args, **{k:v for k,v in settings.items()
573 if k not in pos_args})
574
/usr/local/lib/python3.6/dist-packages/geoviews/element/geo.py in __init__(self, data, **kwargs)
87 elif crs:
88 kwargs['crs'] = crs
---> 89 super(_Element, self).__init__(data, **kwargs)
90
91
/usr/local/lib/python3.6/dist-packages/holoviews/element/raster.py in __init__(self, data, kdims, vdims, bounds, extents, xdensity, ydensity, **params)
235 or (isinstance(data, np.ndarray) and data.size == 0)):
236 data = np.zeros((2, 2))
--> 237 Dataset.__init__(self, data, kdims=kdims, vdims=vdims, extents=extents, **params)
238
239 dim2, dim1 = self.interface.shape(self, gridded=True)[:2]
/usr/local/lib/python3.6/dist-packages/holoviews/core/data/__init__.py in __init__(self, data, kdims, vdims, **kwargs)
188
189 initialized = Interface.initialize(type(self), data, kdims, vdims,
--> 190 datatype=kwargs.get('datatype'))
191 (data, self.interface, dims, extra_kws) = initialized
192 validate_vdims = kwargs.pop('_validate_vdims', True)
/usr/local/lib/python3.6/dist-packages/holoviews/core/data/interface.py in initialize(cls, eltype, data, kdims, vdims, datatype)
204 % (intfc.__name__, e))
205 error = ' '.join([error, priority_error])
--> 206 raise DataError(error)
207
208 return data, interface, dims, extra_kws
DataError: None of the available storage backends were able to support the supplied data format.
This has now been implemented, datashader operations will now respect the coordinate reference system of the inputs. Should make things a lot more intuitive, we should however add a guide on working with Datashader and GeoViews.
Datashader is very useful when working with geospatial data and holoviews now offers datashader operations. One major issue is that these operations do not currently handle a
crs
, which makes it difficult to work with them. The main obstacle here is that a) there is no geoviewsRGB
type, which could be used to hold the output ofdatashade
and b) that projecting on every datashader plot update is very inefficient. I'd therefore suggest a temporary approach to make this work is to offer an operation that projects any Element type to a newcrs
and secondly to add a crs-aware RGB type to geoviews.