ipeaGIT / gtfs2gps

Convert GTFS data into a data.table with GPS-like records in R
https://ipeagit.github.io/gtfs2gps/
Other
71 stars 10 forks source link

bounding box problem while converting gtfs data to sf format #20

Closed pedro-andrade-inpe closed 4 years ago

pedro-andrade-inpe commented 5 years ago

When converting GTFS data to sf format, the bounding box is wrong (therefore plotting does not work properly). This might indicate a problem in sf, but requires further investigation. For now, the code will convert to spatial and then to sf to fix this problem. See the code below to reproduce the problem. When the converted data is saved into a shp and then loaded again, the bounding box is correct.

require(magrittr)

gtfs <- gtfs2gps::read_gtfs(system.file("extdata/poa.zip", package="gtfs2gps"))
crs = 4326

temp_shapes <- gtfs$shapes[,
                           {
                             geometry <- sf::st_linestring(x = matrix(c(shape_pt_lon, shape_pt_lat), ncol = 2))
                             geometry <- sf::st_sfc(geometry)
                             geometry <- sf::st_sf(geometry = geometry)
                           }
                           , by = shape_id
                           ]

temp_shapes[, length := sf::st_length(geometry) %>% units::set_units("km"), by = shape_id]

result <- sf::st_as_sf(temp_shapes, crs = crs)

sf::st_write(result, "myresult.shp")
shp <- sf::st_read("myresult.shp")

sf::st_bbox(result)
sf::st_bbox(shp)

result %>% sf::as_Spatial() %>% sf::st_as_sf() %>% sf::st_bbox()
SymbolixAU commented 4 years ago

The issue is with

temp_shapes <- gtfs$shapes[,
                           {
                             geometry <- sf::st_linestring(x = matrix(c(shape_pt_lon, shape_pt_lat), ncol = 2))
                             geometry <- sf::st_sfc(geometry)
                             geometry <- sf::st_sf(geometry = geometry)
                           }
                           , by = shape_id
                           ]

Where it only calculates the bounding box for the first group (i.e, the first shape_id).

you can use sfheaders to get around this

sf <- sfheaders::sf_linestring(
  obj = gtfs$shapes
  , x = "shape_pt_lon"
  , y = "shape_pt_lat"
)

and once I've finished working on this issue it will keep all the properties too.

rafapereirabr commented 4 years ago

Thanks for the heads up, @SymbolixAU . I'm very glad to see you around here! I messaged Dave Cooley over Twitter the other day and it seems like we could use sfheaders to improve our code speed. We will definitely give it a try.

SymbolixAU commented 4 years ago

Just ping this account or @dcooley if you have any questions (A lot of the time we're the same person :) )

rafapereirabr commented 4 years ago

I've added sfheaders as a package dependency to deal with the conversion between data.frames and sf. See commit #81d1ce1a08d510aa7842d8ae5275fbcc22e243cc. This issue can be closed now.

new <- gtfs2gps::gtfs_shapes_as_sf(gtfs)

sf::st_bbox(shp)
sf::st_bbox(new)