darcy-r / geoparquet-python

API between Parquet files and GeoDataFrames for fast input/output of GIS data.

Note: This project was a proof of concept. For current development please see https://github.com/geopandas/geo-arrow-spec
MIT License
24 stars, 1 fork

Cannot write gdf into gpq #1

Open tiffanychu90 opened 5 years ago

tiffanychu90 commented 5 years ago

I have a GeoDataFrame that I want to write out as a GeoParquet file. Immediately after reading in the gdf, I try to write it to a gpq file, but I get an error.

```python
df.to_parquet('../gis/final/TEST.geoparquet')
```


```
ArrowInvalid                              Traceback (most recent call last)
in
----> 1 df.to_parquet('../gis/final/TEST.geoparquet')

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in to_parquet(self, fname, engine, compression, index, partition_cols, **kwargs)
   2201         to_parquet(self, fname, engine,
   2202                    compression=compression, index=index,
-> 2203                    partition_cols=partition_cols, **kwargs)
   2204
   2205     @Substitution(header='Whether to print column labels, default True')

/opt/conda/lib/python3.7/site-packages/pandas/io/parquet.py in to_parquet(df, path, engine, compression, index, partition_cols, **kwargs)
    250     impl = get_engine(engine)
    251     return impl.write(df, path, compression=compression, index=index,
--> 252                       partition_cols=partition_cols, **kwargs)
    253
    254

/opt/conda/lib/python3.7/site-packages/pandas/io/parquet.py in write(self, df, path, compression, coerce_timestamps, index, partition_cols, **kwargs)
    111         else:
    112             from_pandas_kwargs = {'preserve_index': index}
--> 113         table = self.api.Table.from_pandas(df, **from_pandas_kwargs)
    114         if partition_cols is not None:
    115             self.api.parquet.write_to_dataset(

/opt/conda/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.Table.from_pandas()

/opt/conda/lib/python3.7/site-packages/pyarrow/pandas_compat.py in dataframe_to_arrays(df, schema, preserve_index, nthreads, columns, safe)
    466         arrays = [convert_column(c, t)
    467                   for c, t in zip(columns_to_convert,
--> 468                                   convert_types)]
    469     else:
    470         from concurrent import futures

/opt/conda/lib/python3.7/site-packages/pyarrow/pandas_compat.py in <listcomp>(.0)
    465     if nthreads == 1:
    466         arrays = [convert_column(c, t)
--> 467                   for c, t in zip(columns_to_convert,
    468                                   convert_types)]
    469     else:

/opt/conda/lib/python3.7/site-packages/pyarrow/pandas_compat.py in convert_column(col, ty)
    461             e.args += ("Conversion failed for column {0!s} with type {1!s}"
    462                        .format(col.name, col.dtype),)
--> 463             raise e
    464
    465     if nthreads == 1:

/opt/conda/lib/python3.7/site-packages/pyarrow/pandas_compat.py in convert_column(col, ty)
    455 def convert_column(col, ty):
    456     try:
--> 457         return pa.array(col, type=ty, from_pandas=True, safe=safe)
    458     except (pa.ArrowInvalid,
    459             pa.ArrowNotImplementedError,

/opt/conda/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib.array()

/opt/conda/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib._ndarray_to_array()

/opt/conda/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: ('Could not convert POINT (-118.4298903 34.179617) with type Point: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column geom with type object')
```