ropensci / stplanr

Sustainable transport planning with R
https://docs.ropensci.org/stplanr
Other
416 stars 66 forks source link

Bug in `line_segment()` when using certain values on projected data with `rsgeo` implementation #546

Closed Robinlovelace closed 9 months ago

Robinlovelace commented 9 months ago

As demonstrated below.

# Test locally:
# setwd("~/github/ropensci/stplanr")
# devtools::load_all()

# Test on lastest version
remotes::install_dev("stplanr")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'stplanr' from a github remote, the SHA1 (69486ee4) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(stplanr)

rnet_y = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")
rnet_y_projected = sf::st_transform(rnet_y, "EPSG:27700")
summary(sf::st_length(rnet_y_projected))
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#>   0.6245  11.7269  27.1195  43.1607  56.9653 559.1134
rnet_y_projected_seg = line_segment(rnet_y_projected, segment_length = 10)
rnet_y_projected_seg = line_segment(rnet_y_projected, segment_length = 20)
#> Error in `[[<-`:
#> ! Assigned data `res` must be compatible with existing data.
#> ✖ Existing data has 32032 rows.
#> ✖ Assigned data has 32029 rows.
#> ℹ Only vectors of size 1 are recycled.
#> Caused by error in `vectbl_recycle_rhs_rows()`:
#> ! Can't recycle input of size 32029 to size 32032.
#> Backtrace:
#>      ▆
#>   1. ├─stplanr::line_segment(rnet_y_projected, segment_length = 20)
#>   2. ├─stplanr:::line_segment.sf(rnet_y_projected, segment_length = 20)
#>   3. │ ├─base::`[[<-`(`*tmp*`, attr(l, "sf_column"), value = `<LINESTRING [m]>`)
#>   4. │ └─tibble:::`[[<-.tbl_df`(`*tmp*`, attr(l, "sf_column"), value = `<LINESTRING [m]>`)
#>   5. │   └─tibble:::tbl_subassign(...)
#>   6. │     └─tibble:::vectbl_recycle_rhs_rows(value, fast_nrow(xo), i_arg = NULL, value_arg, call)
#>   7. │       ├─base::withCallingHandlers(...)
#>   8. │       └─vctrs::vec_recycle(value[[j]], nrow)
#>   9. └─vctrs:::stop_recycle_incompatible_size(...)
#>  10.   └─vctrs:::stop_vctrs(...)
#>  11.     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Created on 2023-10-01 with reprex v2.0.2

Thoughts @JosiahParry ?

Robinlovelace commented 9 months ago

Seems that it fails here:

https://github.com/ropensci/stplanr/blob/69486ee4bc42720ccea7db24c4a3e1c685f8e997/R/linefuns.R#L229

Robinlovelace commented 9 months ago

After adding browser() in the commit above it's clear that this issue is due to the following:

[1] 1 1 1 1 1 1
Browse[1]> length(res)
[1] 12565
Browse[1]> nrow(res_tbl)
[1] 12584
Robinlovelace commented 9 months ago

A wider problem with the implementation: it outputs the same number of lengths 'sublines' regardless of the length of the original line in the vectorised version:

Browse[2]> segment_length = 10
Browse[2]> n_segments <- max(round(l_length / segment_length), 1)
Browse[2]> n_segments
[1] 56
Browse[2]> # segmentize the line strings
Browse[2]>       res <- rsgeo::line_segmentize(geo, n_segments)
Browse[2]> # make them into sfc_LINESTRING
Browse[2]>       res <- sf::st_cast(sf::st_as_sfc(res), "LINESTRING")
Browse[2]> summary(sf::st_length(res))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.01115 0.20942 0.48429 0.77075 1.01727 9.98451 
Browse[2]> res
Geometry set for 64064 features 
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 324174.3 ymin: 673163.9 xmax: 326361.4 ymax: 674519.1
CRS:           NA
First 5 geometries:
LINESTRING (324797.2 673311.6, 324797.3 673311.6)
LINESTRING (324797.3 673311.6, 324797.3 673311.6)
LINESTRING (324797.3 673311.6, 324797.4 673311.7)
LINESTRING (324797.4 673311.7, 324797.4 673311.7)
LINESTRING (324797.4 673311.7, 324797.4 673311.7)
Browse[2]> # give them them CRS
Browse[2]>       res <- sf::st_set_crs(res, crs)
Browse[2]> # calculate the number of original geometries
Browse[2]>       n <- length(geo)
Browse[2]> # create index ids to grab rows from
Browse[2]>       ids <- rep.int(1:n, rep(n_segments, n))
Browse[2]> res_tbl <- sf::st_drop_geometry(l)[ids,]
Browse[2]> # assign the geometry column
Browse[2]>       res_tbl[[attr(l, "sf_column")]] <- res
Browse[2]> res_final = sf::st_as_sf(res_tbl)
Browse[2]> nrow(res_final) / nrow(l)
[1] 56
Browse[2]> names(res_final)
[1] "value"     "Quietness" "geometry" 
Browse[2]> res_final
Simple feature collection with 64064 features and 2 fields
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 324174.3 ymin: 673163.9 xmax: 326361.4 ymax: 674519.1
Projected CRS: OSGB36 / British National Grid
# A tibble: 64,064 × 3
   value Quietness                               geometry
   <dbl>     <dbl>                       <LINESTRING [m]>
 1     0        50 (324797.2 673311.6, 324797.3 673311.6)
 2     0        50 (324797.3 673311.6, 324797.3 673311.6)
 3     0        50 (324797.3 673311.6, 324797.4 673311.7)
 4     0        50 (324797.4 673311.7, 324797.4 673311.7)
 5     0        50 (324797.4 673311.7, 324797.4 673311.7)
 6     0        50 (324797.4 673311.7, 324797.5 673311.7)
 7     0        50 (324797.5 673311.7, 324797.5 673311.7)
 8     0        50 (324797.5 673311.7, 324797.6 673311.8)
 9     0        50 (324797.6 673311.8, 324797.6 673311.8)
10     0        50 (324797.6 673311.8, 324797.7 673311.8)
# ℹ 64,054 more rows
# ℹ Use `print(n = ...)` to see more rows
Browse[2]> summary(sf::st_length(res_final))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.01115 0.20942 0.48429 0.77075 1.01727 9.98451 
Browse[2]> summary(sf::st_length(l))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
  0.6246  11.7273  27.1204  43.1622  56.9671 559.1324 
Robinlovelace commented 9 months ago

Update on this: latest commit in #548 diagnoses the line in rnet_y that is failing: n. 108:

Processing row 108 of 1144
  |+++++                                             | 9 % ~13s          Processing row 109 of 1144
Error in `[[<-` at stplanr/R/linefuns.R:364:2:
! Assigned data `res` must be compatible with existing data.
✖ Existing data has 4 rows.
✖ Assigned data has 3 rows.
ℹ Only vectors of size 1 are recycled.
Caused by error in `vectbl_recycle_rhs_rows()`:
! Can't recycle input of size 3 to size 4.
Run `rlang::last_trace()` to see where the error occurred.
Robinlovelace commented 9 months ago

Minimal example for ya @JosiahParry

library(stplanr)
failing_line = structure(list(structure(c(324957.69921197, 324957.873557727, 
324959.863123514, 324961.852683597, 324963.822867622, 324969.636546456, 
324976.718443977, 324996.443964294, 673670.123131518, 673680.139281405, 
673686.784106964, 673693.428933452, 673698.960855279, 673709.992098018, 
673722.114520549, 673742.922904206), dim = c(8L, 2L), class = c("XY", 
"LINESTRING", "sfg"))), class = c("sfc_LINESTRING", "sfc"), precision = 0, bbox = structure(c(xmin = 324957.69921197, 
ymin = 673670.123131518, xmax = 324996.443964294, ymax = 673742.922904206
), class = "bbox"), n_empty = 0L)
sf::st_crs(failing_line) = "EPSG:27700"
line_segment(failing_line, segment_length = 20)
#> Error in line_segment.sfc_LINESTRING(failing_line, segment_length = 20): object 'segment' not found

Created on 2023-10-01 with reprex v2.0.2

Robinlovelace commented 9 months ago

The {rsgeo} bug:

failing_line = structure(list(structure(c(324957.69921197, 324957.873557727, 
324959.863123514, 324961.852683597, 324963.822867622, 324969.636546456, 
324976.718443977, 324996.443964294, 673670.123131518, 673680.139281405, 
673686.784106964, 673693.428933452, 673698.960855279, 673709.992098018, 
673722.114520549, 673742.922904206), dim = c(8L, 2L), class = c("XY", 
"LINESTRING", "sfg"))), class = c("sfc_LINESTRING", "sfc"), precision = 0, bbox = structure(c(xmin = 324957.69921197, 
ymin = 673670.123131518, xmax = 324996.443964294, ymax = 673742.922904206
), class = "bbox"), n_empty = 0L)
sf::st_crs(failing_line) = "EPSG:27700"
line_segment(failing_line, segment_length = 20)
#> Error in line_segment(failing_line, segment_length = 20): could not find function "line_segment"

# Try with rsgeo: 
geo <- rsgeo::as_rsgeo(sf::st_geometry(failing_line))

  # segmentize the line strings
res <- rsgeo::line_segmentize(geo, 4)
res <- sf::st_cast(sf::st_as_sfc(res), "LINESTRING")
res
#> Geometry set for 3 features 
#> Geometry type: LINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 324957.7 ymin: 673670.1 xmax: 324996.4 ymax: 673742.9
#> CRS:           NA
#> LINESTRING (324957.7 673670.1, 324957.9 673680....
#> LINESTRING (324961.1 673690.9, 324961.9 673693....
#> LINESTRING (324969.8 673710.2, 324976.7 673722....
nrow(res) # should be 4
#> NULL
sf::st_length(res)
#> [1] 21.23587 21.23587 42.47173

Created on 2023-10-01 with reprex v2.0.2

JosiahParry commented 9 months ago

@Robinlovelace

For now, can you please install from github? This has been resolved. I need to make my PR to geo and push the patch to CRAN.

rnet_y = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")

g <- rnet_y$geometry |> 
  sf::st_transform("EPSG:27700") 

stplanr::line_segment(g, 5)
#> Geometry set for 5720 features 
#> Geometry type: LINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 324174.3 ymin: 673163.4 xmax: 326361.3 ymax: 674518.6
#> Projected CRS: OSGB36 / British National Grid
#> First 5 geometries:
#> LINESTRING (324797.2 673311.1, 324797.7 673311.3)
#> LINESTRING (324797.7 673311.3, 324798.2 673311.5)
#> LINESTRING (324798.2 673311.5, 324798.7 673311.8)
#> LINESTRING (324798.7 673311.8, 324799.2 673312)
#> LINESTRING (324799.2 673312, 324799.7 673312.2)

Created on 2023-10-01 with reprex v2.0.2

Robinlovelace commented 9 months ago

For now, can you please install from github?

Can do, will add as Remotes. Will keep this issue open until the next CRAN release of rsgeo after which will be a good time to do a CRAN release of stplanr.