ropensci / osmdata

R package for downloading OpenStreetMap data
https://docs.ropensci.org/osmdata
316 stars 46 forks source link

[BUG] trim_osmdata trims too much, with exclude = FALSE it trims too little #276

Closed mutak94 closed 2 years ago

mutak94 commented 2 years ago

I will present here an issue that I am encountering when plotting the Croatian town of Zaprešić. One part of the city boundary doesn't get plotted when you trim the data and if you include exclude = FALSE, then it doesn't trim the neighbouring settlements.

The issue is structured as follows: 1) I will plot the city boundaries without trimming the data, 2) I will plot the city boundaries with trimming the data, 3) I will plot the city boundaries with exclude = FALSE and 4) I will post a screenshot from OpenStreetMap on how the boundary should really look like.

For convenience, here I will plot the problematic part of the boundary on OpenStreetMap: https://www.openstreetmap.org/way/1067231608#map=16/45.8839/15.8201

1 - Plot without trimming the data

library(osmdata)
library(sf)
library(tidyverse)

boundary <- getbb("Zaprešić", featuretype = "city") %>% 
  opq() %>% 
  add_osm_feature(key = "boundary", value = "administrative") %>% 
  osmdata_sf()

ggplot(boundary$osm_lines) +
  geom_sf()

This code will produce the plot of the whole Zagreb county, of which Zaprešić is just one part of. So therefore we would like to trim the data.

image

2 - Plot with trimming the data

bb <- getbb("Zaprešić", featuretype = "city", format_out = "polygon")

boundary <- getbb("Zaprešić", featuretype = "city") %>% 
  opq() %>% 
  add_osm_feature(key = "boundary", value = "administrative") %>% 
  osmdata_sf() %>% 
  trim_osmdata(bb)

ggplot(boundary$osm_lines) +
  geom_sf()

This will now produce a plot of just Zaprešić, however, as you can see, the northeastern part of the border is missing from the plot:

image

3 - Plot with trimming the data and exclude = FALSE

bb <- getbb("Zaprešić", featuretype = "city", format_out = "polygon")

boundary <- getbb("Zaprešić", featuretype = "city") %>% 
  opq() %>% 
  add_osm_feature(key = "boundary", value = "administrative") %>% 
  osmdata_sf() %>% 
  trim_osmdata(bb, exclude = FALSE)

ggplot(boundary$osm_lines) +
  geom_sf()

This will produce a plot which includes the missing border, but it will also include borders of all the neighbouring settlements/municipalities.

image

4 - How the border should look like

And finally, here I will post the screenshot from OpenStreetMap which shows how the border should look like. Note that this will just show the outer city border and not the internal border which is the border between the two city districts, however, that border is completely irrelevant currently.

image

So I was wondering what would be the proper way to have the desired plot?

mpadge commented 2 years ago

Thanks @mutak94. Everything seems to be working as expected, but that data you want are the polygons - in this case, as multipolygons - and not the lines. The polygon trimming works as expected:

library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.1, PROJ 9.0.1; sf_use_s2() is TRUE
library(ggplot2)

bb <- getbb("Zaprešić", featuretype = "city", format_out = "polygon")

boundary <- getbb("Zaprešić", featuretype = "city") |>
    opq() |>
    add_osm_feature(key = "boundary", value = "administrative") |>
    osmdata_sf() |>
    trim_osmdata(bb)

ggplot(boundary$osm_multipolygons) +
    geom_sf()

Created on 2022-10-27 with reprex v2.0.2

That trims down to only one of your expected two polygons. This is what your trimming polygon looks like:

library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.1, PROJ 9.0.1; sf_use_s2() is TRUE
library(ggplot2)

bb <- getbb("Zaprešić", featuretype = "city", format_out = "sf_polygon")
xy <- data.frame (st_coordinates (bb)) # used below to trim plot

boundary <- getbb("Zaprešić", featuretype = "city") |>
    opq() |>
    add_osm_feature(key = "boundary", value = "administrative") |>
    osmdata_sf()

mp <- boundary$osm_multipolygons$geometry

ggplot() +
    geom_sf(data = mp, colour = "blue", fill = NA) +
    geom_sf (data = bb, colour = "red", lwd = 0.5, lty = 2, fill = NA) +
    xlim (range (xy$X)) +
    ylim (range (xy$Y))

Created on 2022-10-27 with reprex v2.0.2

And it lies exactly on the boundary of the polygons you are trying to trim. This can lead to unexpected issues with numerical precision, which i suspect is what's happening here. The solution is simply to expand your bounding box a bit to ensure the objects you want trimmed definitely lie within the polygon:

library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.1, PROJ 9.0.1; sf_use_s2() is TRUE
library(ggplot2)

bb <- getbb("Zaprešić", featuretype = "city", format_out = "sf_polygon")
bb_exp <- st_buffer (bb, 0.1)

boundary <- getbb("Zaprešić", featuretype = "city") |>
    opq() |>
    add_osm_feature(key = "boundary", value = "administrative") |>
    osmdata_sf() |>
    trim_osmdata(bb_exp)

ggplot(boundary$osm_multipolygons) +
    geom_sf()

Created on 2022-10-27 with reprex v2.0.2

Conclusion

Just apply st_buffer to expand your trimming polygon, and set some arbitrarily small value - here, i've used 0.1 degrees.

Please close if that fixes it for you. Thanks1

mutak94 commented 2 years ago

Thank you for the thorough explanation of these mechanisms and your help with the solution.