holoviz / geoviews

Simple, concise geographical visualization in Python
http://geoviews.org
BSD 3-Clause "New" or "Revised" License

Multiple issues when making stuff from geopandas GeoDataFrame #212

Closed krvkir closed 5 years ago

krvkir commented 6 years ago

I noticed inconsistent behavior in objects created from GeoDataFrames: operations that work on objects built from regular pandas DataFrames fail on their GeoDataFrame counterparts. The same code worked fine with an earlier version of geoviews (about half a year ago).

Here is the sample code to reproduce the issue:

In [1]: import numpy as np
   ...: import pandas as pd
   ...: import geopandas as gpd
   ...: 
   ...: from shapely.geometry import Point
   ...: 
   ...: import holoviews as hv
   ...: import geoviews as gv
   ...: hv.extension('bokeh')
   ...: 
   ...: map_background = gv.WMTS('https://maps.wikimedia.org/osm-intl/{Z}/{X}/{Y}@2x.png')
   ...: 
   ...: 

In [2]: # Prepare test data --- a geodataframe with points.
   ...: values = np.array(range(100))
   ...: lon = 37 + values / 1000
   ...: lat = 55 + values / 1000
   ...: geom = [Point(lon, lat) for lon, lat in zip(lon, lat)]
   ...: points = gpd.GeoDataFrame({'geometry': geom,
   ...:                            'value': values,
   ...:                            'longitude': lon,
   ...:                            'latitude': lat})
   ...: points.head(3)
   ...:                            
Out[2]: 
                geometry  value  longitude  latitude
0          POINT (37 55)      0     37.000    55.000
1  POINT (37.001 55.001)      1     37.001    55.001
2  POINT (37.002 55.002)      2     37.002    55.002

In [3]: # Make `Dataset` out of the geodataframe and make points out of it --- failure:
   ...: geods = gv.Dataset(points, vdims=['value'])
   ...: print(geods)
   ...: geods.to(gv.Points)
   ...: 
   ...: 
:Dataset   [Longitude,Latitude]   (value)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-caef3816a965> in <module>()
      2 geods = gv.Dataset(points, vdims=['value'])
      3 print(geods)
----> 4 geods.to(gv.Points)

/data/+Projects/.env/lib/python3.6/site-packages/geoviews/element/__init__.py in __call__(self, *args, **kwargs)
     37                 args = (Dataset,)
     38                 kwargs['kdims'] = []
---> 39         converted = super(GeoConversion, self).__call__(*args, **kwargs)
     40         if is_gpd:
     41             if kdims is None: kdims = group_type.kdims

/data/+Projects/.env/lib/python3.6/site-packages/holoviews/core/data/__init__.py in __call__(self, new_type, kdims, vdims, groupby, sort, **kwargs)
    167                 selected = self._element.clone(kdims=ds_kdims, vdims=ds_vdims)
    168             else:
--> 169                 selected = self._element.reindex(groupby+kdims, vdims)
    170         params = {'kdims': [selected.get_dimension(kd, strict=True) for kd in kdims],
    171                   'vdims': [selected.get_dimension(vd, strict=True) for vd in vdims],

/data/+Projects/.env/lib/python3.6/site-packages/holoviews/core/data/__init__.py in reindex(self, kdims, vdims)
    375             new_type = self._vdim_reductions.get(len(val_dims), type(self))
    376 
--> 377         data = self.interface.reindex(self, key_dims, val_dims)
    378         datatype = self.datatype
    379         if gridded and dropped:

AttributeError: type object 'GeoPandasInterface' has no attribute 'reindex'

In [4]: # But with regular dataframe this works:
   ...: ds = gv.Dataset(
   ...:     pd.DataFrame(points), 
   ...:     kdims=['longitude', 'latitude'],
   ...:     vdims=['value'])
   ...: print(ds)
   ...:     
:Dataset   [longitude,latitude]   (value)

In [5]: # We also can make `Points` right out of the geodataframe:
   ...: ele = gv.Points(points, vdims=['value'])

In [6]: # We can select a slice from a dataset made out of dataframe:
   ...: ds.select(value=50)
Out[6]: :Dataset   [longitude,latitude]   (value)

In [7]: # But not out of geodataframe:
   ...: geods.select(value=50)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-d4f6baf2c31d> in <module>()
      1 # But not out of geodataframe:
----> 2 geods.select(value=50)

/data/+Projects/.env/lib/python3.6/site-packages/holoviews/core/data/__init__.py in select(self, selection_specs, **selection)
    340             return self
    341 
--> 342         data = self.interface.select(self, **selection)
    343 
    344         if np.isscalar(data):

/data/+Projects/.env/lib/python3.6/site-packages/holoviews/core/data/multipath.py in select(cls, dataset, selection_mask, **selection)
    121         Applies selectiong on all the subpaths.
    122         """
--> 123         if not dataset.data:
    124             return []
    125         ds = cls._inner_dataset_template(dataset)

/data/+Projects/.env/lib/python3.6/site-packages/pandas/core/generic.py in __nonzero__(self)
   1571         raise ValueError("The truth value of a {0} is ambiguous. "
   1572                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1573                          .format(self.__class__.__name__))
   1574 
   1575     __bool__ = __nonzero__

ValueError: The truth value of a GeoDataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
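
The last traceback frame shows where the error comes from: `if not dataset.data` asks pandas for the truth value of a whole frame, which pandas refuses to compute. The sketch below reproduces that behavior with a plain DataFrame and checks `.empty` instead, as the error message suggests. The helper name `is_empty_frame` is hypothetical and used only for illustration; this is not necessarily how holoviews handles the check.

```python
import pandas as pd


def is_empty_frame(data):
    """Return True when a DataFrame (or GeoDataFrame) has no rows or columns."""
    # `not data` would call DataFrame.__bool__, which raises ValueError,
    # so the explicit `.empty` attribute is used instead.
    return data.empty


frame = pd.DataFrame({"value": [0, 1, 2]})

# Reproduce the error from the traceback with a plain DataFrame.
try:
    bool(frame)
    truthiness_error = None
except ValueError as exc:
    truthiness_error = str(exc)

print(truthiness_error)
print(is_empty_frame(frame))            # frame with rows
print(is_empty_frame(frame.iloc[0:0]))  # same columns, zero rows
```

A GeoDataFrame is a DataFrame subclass, so the same check should apply to it.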

Versions:

In [8]: gv.__version__
Out[8]: '1.5.3'

In [9]: gpd.__version__
Out[9]: '0.3.0'

For convenience, here's the notebook with this code.

2018-07-23 geoviews bugs.ipynb.gz

jbednar commented 6 years ago

My guess is that geopandas has changed its API since that code was written. I can reproduce the first error; I haven't tried the rest. It seems we'll need to update the geoviews code for the current geopandas. In the meantime, you might try using an earlier version of geopandas.

krvkir commented 6 years ago

Thanks for the fast reply!

Luckily, I managed to rewrite my code so that it avoids converting geodataframes directly to datasets: I extracted lon and lat from the points into separate columns, converted the geodataframe to a regular dataframe, and then declared those columns as kdims. So I'm not in a big hurry with this issue. But this workaround doesn't help with geometries other than points, so the issue is still important.
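
A minimal sketch of the workaround described above, assuming every geometry is a shapely `Point`. The helper name `points_to_plain_frame` and its parameters are hypothetical, introduced only for illustration; the `gv.Dataset` call mirrors the one from the original report and is skipped if geoviews is not installed.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point


def points_to_plain_frame(gdf, lon_col="longitude", lat_col="latitude"):
    """Copy point coordinates into plain columns and drop the geometry column."""
    # Wrapping in pd.DataFrame makes sure the result is a regular DataFrame.
    plain = pd.DataFrame(gdf.drop(columns=gdf.geometry.name))
    plain[lon_col] = [point.x for point in gdf.geometry]
    plain[lat_col] = [point.y for point in gdf.geometry]
    return plain


gdf = gpd.GeoDataFrame({
    "value": [0, 1, 2],
    "geometry": [Point(37.0, 55.0), Point(37.001, 55.001), Point(37.002, 55.002)],
})
plain = points_to_plain_frame(gdf)
print(plain)

# Declare the extracted columns as key dimensions, as in the original report.
try:
    import geoviews as gv
    ds = gv.Dataset(plain, kdims=["longitude", "latitude"], vdims=["value"])
except ImportError:
    ds = None  # geoviews is not available in this environment
```

As noted above, this only covers point geometries; lines and polygons have no single lon/lat pair to extract.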

philippjfr commented 5 years ago

I can't reproduce these issues on master.