mrocklin / dask-geopandas

Parallelized GeoPandas with Dask

AttributeError("module 'geopandas' has no attribute 'vectorized'",) #2

Open KarenChen9999 opened 6 years ago

KarenChen9999 commented 6 years ago

Hi,

I have installed geopandas-cython, dask, and dask_geopandas, and I am trying to set the geometry using dask_geopandas.

crs={'ellps': 'GRS80', 'no_defs': True, 'proj': 'longlat'}
df = dd.read_csv(address_path+"test2.csv")
gf = df.set_geometry(df[['long', 'lat']], crs=crs)

But I got the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/utils.py in raise_on_meta_error(funcname)
    136     try:
--> 137         yield
    138     except Exception as e:

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/core.py in _emulate(func, *args, **kwargs)
   3289     with raise_on_meta_error(funcname(func)):
-> 3290         return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
   3291 

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask_geopandas/core.py in _points_from_xy(x, y, crs)
    427 def _points_from_xy(x, y, crs=None):
--> 428     points = gpd.vectorized.points_from_xy(x.values, y.values)
    429     return gpd.GeoSeries(points, index=x.index, crs=crs)

AttributeError: module 'geopandas' has no attribute 'vectorized'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-2-e808c7abea15> in <module>()
      4 crs={'ellps': 'GRS80', 'no_defs': True, 'proj': 'longlat'}
      5 df = dd.read_csv(address_path+"test2.csv")
----> 6 gf = df.set_geometry(df[['long', 'lat']], crs=crs)

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask_geopandas/core.py in set_geometry(df, geometry, crs)
    439     if isinstance(geometry, dd.DataFrame) and len(geometry.columns) == 2:
    440         a, b = geometry.columns
--> 441         geometry = points_from_xy(geometry[a], geometry[b], crs=crs)
    442 
    443     assert df.npartitions == geometry.npartitions

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask_geopandas/core.py in points_from_xy(x, y, crs)
    431 
    432 def points_from_xy(x, y, crs=None):
--> 433     s = dd.map_partitions(_points_from_xy, x, y, crs=crs)
    434     example = gpd.GeoSeries(Point(0, 0))
    435     return GeoSeries(s.dask, s._name, example, [all_space] * s.npartitions)

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/core.py in map_partitions(func, *args, **kwargs)
   3322 
   3323     if meta is no_default:
-> 3324         meta = _emulate(func, *args, **kwargs)
   3325 
   3326     if all(isinstance(arg, Scalar) for arg in args):

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/core.py in _emulate(func, *args, **kwargs)
   3288     """
   3289     with raise_on_meta_error(funcname(func)):
-> 3290         return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
   3291 
   3292 

/apps/anaconda3/envs/dask/lib/python3.6/contextlib.py in __exit__(self, type, value, traceback)
     97                 value = type()
     98             try:
---> 99                 self.gen.throw(type, value, traceback)
    100             except StopIteration as exc:
    101                 # Suppress StopIteration *unless* it's the same exception that

/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/utils.py in raise_on_meta_error(funcname)
    148                ).format(" in `{0}`".format(funcname) if funcname else "",
    149                         repr(e), tb)
--> 150         raise ValueError(msg)
    151 
    152 

ValueError: Metadata inference failed in `_points_from_xy`.

Original error is below:
------------------------
AttributeError("module 'geopandas' has no attribute 'vectorized'",)

Traceback:
---------
  File "/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/utils.py", line 137, in raise_on_meta_error
    yield
  File "/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask/dataframe/core.py", line 3290, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/apps/anaconda3/envs/dask/lib/python3.6/site-packages/dask_geopandas/core.py", line 428, in _points_from_xy
    points = gpd.vectorized.points_from_xy(x.values, y.values)

Can you please advise what I can try to resolve the issue?

mrocklin commented 6 years ago

My guess is that this is because the geopandas-cython branch of geopandas has evolved since this proof-of-concept library was written. In general I would not expect a stable experience with any of these libraries for the near future.

mrocklin commented 6 years ago

I've added a note to the README about the current status of this project: https://github.com/mrocklin/dask-geopandas#status

jorisvandenbossche commented 6 years ago

As far as I am aware, nothing has changed there: the vectorized module still exists. (I am planning to do some reorganisation there, https://github.com/geopandas/geopandas/pull/662, but that has not yet been merged.)

@KarenChen9999 Are you sure the geopandas-cython branch is installed correctly? If you run import geopandas and then geopandas.__version__ in a console, what do you get? (It should be something like 1.0.0dev+... for the geopandas-cython branch.)
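
That check can also be scripted. Below is a minimal sketch that uses only the standard library, so it runs even when geopandas is not installed. The helper name `describe_module` is hypothetical and not part of geopandas or dask.

```python
import importlib
import importlib.util


def describe_module(name, attribute):
    """Return (installed, version, has_attribute) for a top-level module."""
    if importlib.util.find_spec(name) is None:
        # Module cannot be found in the current environment.
        return (False, None, False)
    module = importlib.import_module(name)
    version = getattr(module, "__version__", None)
    return (True, version, hasattr(module, attribute))


# Reports which geopandas the environment sees and whether it
# exposes the `vectorized` attribute that dask_geopandas calls.
print(describe_module("geopandas", "vectorized"))
```

If the last element of the tuple is False, the installed geopandas is not a build that provides `vectorized`, which would explain the AttributeError.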

jorisvandenbossche commented 6 years ago

BTW, you can install the geopandas-cython version also with conda:

conda install --channel conda-forge/label/dev geopandas

hafez-ahmad commented 4 years ago

AttributeError: module 'geopandas' has no attribute 'GesoSeries'

jorisvandenbossche commented 4 years ago

@hafez-ahmad a more recent effort to get a dask-enabled version of geopandas is currently being developed in https://github.com/jsignell/dask-geopandas. It might be a bit early, but you are certainly welcome to try that out!