paleolimbot closed 11 months ago
Merging #34 (649d7a8) into main (ff9771e) will increase coverage by 0.48%. The diff coverage is 99.34%.
@@ Coverage Diff @@
## main #34 +/- ##
==========================================
+ Coverage 94.59% 95.07% +0.48%
==========================================
Files 10 10
Lines 1257 1401 +144
==========================================
+ Hits 1189 1332 +143
- Misses 68 69 +1
Files | Coverage Δ | |
---|---|---|
geoarrow-pyarrow/src/geoarrow/pyarrow/_type.py | 95.66% <100.00%> (+0.17%) | :arrow_up: |
geoarrow-pyarrow/src/geoarrow/pyarrow/io.py | 99.32% <99.25%> (-0.68%) | :arrow_down: |
Substantially faster than geopandas IO, roughly 4x for both writing and reading in the timings below (just because it avoids converting to/from `np.array(<shapely>)`):
from pyarrow import feather
import geopandas
import geoarrow.pyarrow.io as io
# curl -L "https://github.com/geoarrow/geoarrow-data/releases/download/v0.1.0/ns-water-water_line.arrow" -o ns-water-water_line.arrow
tab = feather.read_table("ns-water-water_line.arrow", columns=["geometry"])
df = tab.to_pandas()
df.geometry = df.geometry.geoarrow.to_geopandas()
df = geopandas.GeoDataFrame(df)
%timeit io.write_geoparquet_table(tab, "test.parquet")
365 ms ± 6.57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit io.read_geoparquet_table("test.parquet")
205 ms ± 2.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit df.to_parquet("test.parquet")
1.42 s ± 25.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit geopandas.read_parquet("test.parquet")
941 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
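For a quick read of those numbers, here is a small sketch that turns the mean timings above into speedup ratios. The dictionary layout and the `speedup` helper are just illustrative names, not part of geoarrow or geopandas, and the ratios only describe this one file on this one machine.

```python
# Mean timings copied from the %timeit output above, in milliseconds.
# The dict layout and helper name are illustrative, not a library API.
timings_ms = {
    "write": {"geoarrow": 365.0, "geopandas": 1420.0},
    "read": {"geoarrow": 205.0, "geopandas": 941.0},
}


def speedup(op):
    """Return how many times faster the geoarrow path was for `op`."""
    t = timings_ms[op]
    return t["geopandas"] / t["geoarrow"]


for op in timings_ms:
    print(f"{op}: {speedup(op):.1f}x faster")
```

This ignores the reported standard deviations, which are small relative to the gap between the two approaches.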
Adds direct Parquet IO to/from GeoArrow extension types (only via a Table, for now):