r-spatial / lwgeom

bindings to the liblwgeom library
https://r-spatial.github.io/lwgeom/

preallocate CharacterVector in st_astext to improve performance on large data #74

Closed MilesMcBain closed 2 years ago

MilesMcBain commented 2 years ago

st_astext currently grows its Rcpp CharacterVectors element by element, which seems to incur quite a large time and memory penalty on larger datasets. This change preallocates the output vector instead.
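To illustrate the difference between the two patterns, here is a minimal standalone sketch. It is not the lwgeom source: it uses `std::vector<std::string>` in place of `Rcpp::CharacterVector`, and `to_wkt`, `astext_growing` and `astext_preallocated` are hypothetical names made up for the example. The growing version deliberately copies the whole vector on every append, on the assumption that this is roughly what appending to an R-backed vector costs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-geometry text conversion; the real
// conversion is done by liblwgeom and is not reproduced here.
std::string to_wkt(int id) {
    return "POINT(" + std::to_string(id) + " 0)";
}

// Growing pattern: every append allocates a vector one element longer and
// copies everything written so far, so total copying is O(n^2).
// (Assumption: this models the cost of appending to an R-backed vector.)
std::vector<std::string> astext_growing(const std::vector<int>& ids) {
    std::vector<std::string> out;
    for (int id : ids) {
        std::vector<std::string> bigger(out.size() + 1);  // fresh allocation
        for (std::size_t i = 0; i < out.size(); ++i) {
            bigger[i] = out[i];                           // copy old contents
        }
        bigger[out.size()] = to_wkt(id);
        out = std::move(bigger);
    }
    return out;
}

// Preallocated pattern: size the output once, then assign by index.
// One allocation for the container and O(n) work overall.
std::vector<std::string> astext_preallocated(const std::vector<int>& ids) {
    std::vector<std::string> out(ids.size());
    for (std::size_t i = 0; i < ids.size(); ++i) {
        out[i] = to_wkt(ids[i]);
    }
    assert(out.size() == ids.size());
    return out;
}
```

Both functions return the same strings; only the allocation behaviour differs, which is where the time and memory gap would be expected to come from.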

Here is an example of the performance without this change:

poly <- sf::st_as_sfc("POLYGON((0 0,0.5 0,0.5 0.5,0.5 0,1 0,1 1,0 1,0 0))")
polys <- rep(poly, 100000)
bench::mark(
  lwgeom::st_astext(polys)
)
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 1 x 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 lwgeom::st_astext(polys)    1.33m    1.33m    0.0125    37.3GB     3.17

Created on 2021-09-30 by the reprex package (v2.0.0)

Here is the performance with this change:

devtools::load_all()
#> i Loading lwgeom
#> Linking to liblwgeom 3.0.0beta1 r16016, GEOS 3.9.1, PROJ 7.2.1
poly <- sf::st_as_sfc("POLYGON((0 0,0.5 0,0.5 0.5,0.5 0,1 0,1 1,0 1,0 0))")
polys <- rep(poly, 100000)
bench::mark(
  lwgeom::st_astext(polys)
)
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 1 x 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 lwgeom::st_astext(polys)    935ms    935ms      1.07    23.7MB     1.07

Created on 2021-09-30 by the reprex package (v2.0.0)

MilesMcBain commented 2 years ago

Oops, this got messed up by an amend on my end. I'll open a new one.