geoarrow / geoarrow-python

Python implementation of the GeoArrow specification
http://geoarrow.org/geoarrow-python/
Apache License 2.0

fix(geoarrow-pyarrow): Ensure/prefer ISO WKB where possible #39

Closed paleolimbot closed 9 months ago

paleolimbot commented 9 months ago

On my computer, verifying that every item in an ISO WKB array is actually ISO WKB encoded costs about 20 ms per million points.

import pyarrow as pa
import geoarrow.pyarrow as ga
import numpy as np

n = int(1e6)
xs = np.random.random(n)
ys = np.random.random(n)
zs = np.random.random(n)
points = ga.point().with_dimensions(ga.Dimensions.XYZ).from_geobuffers(None, xs, ys, zs)
gp = ga.to_geopandas(points)
ewkb = pa.array(gp.to_wkb())
not_ewkb = ga.as_wkb(ewkb, strict_iso_wkb=True)

# Best case detection
%timeit ga._compute._any_ewkb(ewkb)
#> 35.9 µs ± 215 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
# Worst case detection (can probably optimize better in geoarrow-c)
%timeit ga._compute._any_ewkb(not_ewkb)
#> 20.1 ms ± 91.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# Comes out about even to fix it
%timeit ga.as_wkb(not_ewkb, strict_iso_wkb=True)
#> 20.1 ms ± 41.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit ga.as_wkb(ewkb, strict_iso_wkb=True)
#> 23.8 ms ± 116 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
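
For readers unfamiliar with the two flavours, here is a minimal pure-Python sketch of what an EWKB check looks like at the byte level. It is not the implementation behind `ga._compute._any_ewkb`; the helper names (`is_ewkb`, `any_ewkb`, `point_z`) are hypothetical and exist only for this illustration. It relies on the usual convention that EWKB marks Z, M, and SRID with high bits in the geometry type code, whereas ISO WKB adds 1000, 2000, or 3000 to the type code. Only the top-level header is inspected, so nested geometries inside collections are assumed to agree with their parent.

```python
import struct

# EWKB stores dimension/SRID information as flag bits in the type code.
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000
EWKB_FLAGS = EWKB_Z | EWKB_M | EWKB_SRID


def is_ewkb(wkb: bytes) -> bool:
    """Return True if the top-level WKB header carries EWKB flag bits."""
    if len(wkb) < 5:
        raise ValueError("a WKB header needs at least 5 bytes")
    # Byte 0 is the byte order marker: 0 = big endian, 1 = little endian.
    if wkb[0] == 1:
        fmt = "<I"
    elif wkb[0] == 0:
        fmt = ">I"
    else:
        raise ValueError("invalid WKB byte order marker")
    (type_code,) = struct.unpack_from(fmt, wkb, 1)
    return (type_code & EWKB_FLAGS) != 0


def any_ewkb(items) -> bool:
    """Scan an iterable of WKB blobs (None = null), stopping at the first EWKB item."""
    return any(is_ewkb(item) for item in items if item is not None)


def point_z(type_code: int, x: float, y: float, z: float) -> bytes:
    """Hypothetical helper: build a little-endian XYZ point with the given type code."""
    return struct.pack("<BIddd", 1, type_code, x, y, z)


iso_point = point_z(1001, 1.0, 2.0, 3.0)         # ISO WKB: Point Z is type 1001
ewkb_point = point_z(0x80000001, 1.0, 2.0, 3.0)  # EWKB: Point (1) with the Z flag set
```

Because `any_ewkb` returns as soon as it sees one flagged header, an all-EWKB input is its cheapest case and an all-ISO input forces a full scan, the same best-case/worst-case shape as the timings above.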

codecov[bot] commented 9 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (f914349) 95.09% compared to head (f6e31ea) 95.56%. Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #39      +/-   ##
==========================================
+ Coverage   95.09%   95.56%   +0.47%
==========================================
  Files          10       10
  Lines        1426     1444      +18
==========================================
+ Hits         1356     1380      +24
+ Misses         70       64       -6
```
