Closed kcpevey closed 5 years ago
I think this is definitely very important, we've already tried to improve these in certain cases but there's definitely a lot more we can do, particularly in regard to polygon data.
Also, somewhat related, I'll say that its fairly annoying to have to feed the poly/point data into the drawing in one format, and get back from the drawing that same data in another format (i.e. list of list of tuples vs ordered dict).
Also, somewhat related, I'll say that its fairly annoying to have to feed the poly/point data into the drawing in one format, and get back from the drawing that same data in another format (i.e. list of list of tuples vs ordered dict).
I do agree with this, however while we accept the data in the "list of lists of tuples format" that is not a valid internal representation of the data, so even before you use any stream your data will have been converted to a different format. Polygon data is currently stored in one of the following ways:
1) List of dictionaries for each path 2) List of arrays for each path 3) List of dataframes for each path 4) A geopandas dataframe
By default it will convert the lists of lists of tuples to 1, 2, or 3, something that is unlikely to change.
Separately though, the drawing tool streams' .data
attribute currently returns data in the bokeh storage format which is different again and we should probably at least make that consistent.
Could we avoid this confusion by modifying our examples to use one of 1, 2, or 3 when the data is first fed in?
@jbednar I think that would help. I'm not sure where I got the "list of lists of tuples" format, but I found that it worked and ran with it. It actually almost always requires me to reformat my data INTO that format so I'm happy to know I can stop doing that. Taking that input format option away and being forced into providing input in 1, 2, 3 or 4 might be more helpful in the long run rather than allowing for my ignorance on the front end :)
I'm not sure where I got the "list of lists of tuples" format, but I found that it worked and ran with it.
I think we use it in a few places since it's the simplest Python literal format.
It actually almost always requires me to reformat my data INTO that format so I'm happy to know I can stop doing that.
Would be good to know what format your data usually comes in and we could consider supporting that.
Taking that input format option away and being forced into providing input in 1, 2, 3 or 4
For backward compatibility reasons this is unlikely to happen.
I don't think we need to prevent that format, just remove it from our examples if it's one that causes conceptual problems later, even if it seems convenient at the time.
Would be good to know what format your data usually comes in and we could consider supporting that.
My data comes in a variety of formats, generally np arrays and dataframes depending on the source. For example, I had been doing something like this with a simple two column csv file with x and y:
boundary_df = pd.read_csv('boundary.xy',delimiter='\t', header=None, names=['x','y'])
boundary_poly = [tuple(t) for t in zip(boundary_df.x.values, boundary_df.y.values)]
viz = PolyAndPointAnnotator(path_type=gv.Path, polys=[boundary_poly],
crs=coord_sys,extent=(np.NaN,)*4, tile_url='http://tile.stamen.com/terrain/{Z}/{X}/{Y}.jpg')
Which was quite messy.
That is indeed quite messy, I would expect this to work:
boundary_df = pd.read_csv('boundary.xy',delimiter='\t', header=None, names=['x','y'])
viz = PolyAndPointAnnotator(path_type=gv.Path, polys=[boundary_df],
crs=coord_sys,extent=(np.NaN,)*4, tile_url='http://tile.stamen.com/terrain/{Z}/{X}/{Y}.jpg')
::insert dramatic eye roll here:: That works beautifully. :)
Good to hear :-) On the inverse if trying to get the data back out from the stream, you should be able to fetch it in the same format using:
stream.element.split(datatype='dataframe')
That works great too! And it has the added benefit of consistent Latitude/Longitude labeling preventing this https://github.com/pyviz/EarthSim/issues/60
Is there any reason why the above suggestion:
boundary_df = pd.read_csv('boundary.xy',delimiter='\t', header=None, names=['x','y'])
viz = PolyAndPointAnnotator(path_type=gv.Path, polys=[boundary_df],
crs=coord_sys,extent=(np.NaN,)*4, tile_url='http://tile.stamen.com/terrain/{Z}/{X}/{Y}.jpg')
Would not work with with points? e.g.:
viz = PolyAndPointAnnotator(path_type=gv.Path, polys=[boundary_df], points=[boundary_df],
crs=coord_sys,extent=(np.NaN,)*4, tile_url='http://tile.stamen.com/terrain/{Z}/{X}/{Y}.jpg')
My current example, which is essentially the same as above, gives this error on the point specification:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-12-0a27f44fdd5e> in <module>()
5 poly_columns=[],
6 polys=[centerline_df],
----> 7 points=[reservoir_df])
8
c:\projects\ers\github\earthsim\master\earthsim\earthsim\annotators.py in __init__(self, poly_data, **params)
191
192 def __init__(self, poly_data={}, **params):
--> 193 super(PolyAnnotator, self).__init__(**params)
194 style = dict(editable=True)
195 plot = dict(width=self.width, height=self.table_height)
c:\projects\ers\github\earthsim\master\earthsim\earthsim\annotators.py in __init__(self, **params)
235 self.points = self.points.add_dimension(col, 0, None, True)
236 self.point_stream = PointDraw(source=self.points, data={})
--> 237 projected = gv.project(self.points, projection=ccrs.PlateCarree())
238 self.point_table = Table(projected).opts(plot=plot, style=style)
239 self.point_link = PointTableLink(source=self.points, target=self.point_table)
C:\ProgramData\Anaconda3\envs\earthsimAUG\lib\site-packages\param\parameterized.py in __new__(class_, *args, **params)
2047 inst = class_.instance()
2048 inst.param._set_name(class_.__name__)
-> 2049 return inst.__call__(*args,**params)
2050
2051 def __call__(self,*args,**kw):
c:\projects\ers\github\holoviews\master\holoviews\holoviews\core\operation.py in __call__(self, element, **params)
161 operation=self, kwargs=params)
162 elif isinstance(element, ViewableElement):
--> 163 processed = self._apply(element)
164 elif isinstance(element, DynamicMap):
165 if any((not d.values) for d in element.kdims):
c:\projects\ers\github\holoviews\master\holoviews\holoviews\core\operation.py in _apply(self, element, key)
119 for hook in self._preprocess_hooks:
120 kwargs.update(hook(self, element))
--> 121 ret = self._process(element, key)
122 for hook in self._postprocess_hooks:
123 ret = hook(self, ret, **kwargs)
C:\ProgramData\Anaconda3\envs\earthsimAUG\lib\site-packages\geoviews\operation\projection.py in _process(self, element, key)
411 for op in self._operations:
412 element = element.map(op.instance(projection=self.p.projection),
--> 413 op.supported_types)
414 return element
c:\projects\ers\github\holoviews\master\holoviews\holoviews\core\dimension.py in map(self, map_fn, specs, clone)
727 return deep_mapped
728 else:
--> 729 return map_fn(self) if applies else self
730
731
c:\projects\ers\github\holoviews\master\holoviews\holoviews\core\operation.py in __call__(self, element, **params)
161 operation=self, kwargs=params)
162 elif isinstance(element, ViewableElement):
--> 163 processed = self._apply(element)
164 elif isinstance(element, DynamicMap):
165 if any((not d.values) for d in element.kdims):
c:\projects\ers\github\holoviews\master\holoviews\holoviews\core\operation.py in _apply(self, element, key)
119 for hook in self._preprocess_hooks:
120 kwargs.update(hook(self, element))
--> 121 ret = self._process(element, key)
122 for hook in self._postprocess_hooks:
123 ret = hook(self, ret, **kwargs)
C:\ProgramData\Anaconda3\envs\earthsimAUG\lib\site-packages\geoviews\operation\projection.py in _process(self, element, key)
32
33 def _process(self, element, key=None):
---> 34 return element.map(self._process_element, self.supported_types)
35
36
c:\projects\ers\github\holoviews\master\holoviews\holoviews\core\dimension.py in map(self, map_fn, specs, clone)
727 return deep_mapped
728 else:
--> 729 return map_fn(self) if applies else self
730
731
C:\ProgramData\Anaconda3\envs\earthsimAUG\lib\site-packages\geoviews\operation\projection.py in _process_element(self, element)
191 xdim, ydim = element.dimensions()[:2]
192 xs, ys = (element.dimension_values(i) for i in range(2))
--> 193 coordinates = self.p.projection.transform_points(element.crs, xs, ys)
194 mask = np.isfinite(coordinates[:, 0])
195 new_data = {k: v[mask] for k, v in element.columns().items()}
lib\cartopy\_crs.pyx in cartopy._crs.CRS.transform_points()
ValueError: could not convert string to float: 'x'
My most recent comment above is related to Issue #133 .
I continue to run into DataError: None of the available storage backends were able to support the supplied data format.
occasionally, but perhaps less so now that annotators can accept all element types. I'm closing this. If it comes back up, we can always reopen.
When I feed polygons and points into an annotator to be included in the initial draw, I almost never format the data correctly. For example, polygons are something like a list of a list of tuples or a list of a list of lists. Anyway, its fine to require the data in a certain format, but the error message I get when I have incorrect formatting is not helpful for me. Its so deep inside holoviews I think it has something to do with the way I'm asking it to draw. It throws me off every time (and I've done this A LOT). I would humbly suggest some better feedback for improper formatting if its something that wouldn't be too challenging.