ipeaGIT / geobr

Easy access to official spatial data sets of Brazil in R and Python
https://ipeagit.github.io/geobr/

Error: Edge 46 crosses edge 48 #340

Open marianaosb opened 5 months ago

marianaosb commented 5 months ago

Hi, I've used read_municipality(simplified = FALSE) to load the data by municipality. But when I try to build a Queen neighbours matrix, the following error occurs:

Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented, : Loop 0 is not valid: Edge 46 crosses edge 48

How can I fix that?

marianaosb commented 5 months ago

Is it correct to use the sf_use_s2(FALSE) command before running the neighborhood matrix?

rafapereirabr commented 5 months ago

Could you please share the code with a reproducible example?

rafapereirabr commented 3 months ago

@marianaosb, could you please share the code with a reproducible example that returns this error?

gabrielmagno commented 3 months ago

Hi @rafapereirabr !

I think I had a problem similar to @marianaosb's when trying to use sf::st_union to "unify" the shapes of a given group of municipalities.

My real groups are different, but to create an easy reproducible example let's say that I have all the municipalities and want to merge their shapes by state (I know this could be accomplished by using the states dataset directly, but as I mentioned this is just a toy example). This is the initial code:

library(dplyr)
library(ggplot2)
library(geobr)
library(sf)

cities <- read_municipality(code_muni="all", simplified=TRUE)

cities |> 
    group_by(abbrev_state) |> 
    summarise(
        geom = geom |> st_union()
    ) |> 
    ggplot() +
        geom_sf(fill="#2D3E50", color="#FEBF57", size=.15, show.legend = FALSE) +
        theme_minimal()

Which gives this error:

Error in `summarise()`:
ℹ In argument: `geom = st_union(geom)`.
ℹ In group 1: `abbrev_state = "AC"`.
Caused by error in `wk_handle.wk_wkb()`:
! Loop 0 is not valid: Edge 253 crosses edge 255

If I set simplified=FALSE the same error appears, but in a different group/state:

Error in `summarise()`:
ℹ In argument: `geom = st_union(geom)`.
ℹ In group 11: `abbrev_state = "MG"`.
Caused by error in `wk_handle.wk_wkb()`:
! Loop 0 is not valid: Edge 46 crosses edge 48

I was able to fix the error by setting this sf flag globally:

sf_use_s2(FALSE)

This works both for simplified=FALSE and simplified=TRUE, but with different results (which is expected I think, considering that one is... simplified geometry, and the other is complete).
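
If turning s2 off for the whole session is undesirable, the switch can be limited to the one operation that fails. The following is only a sketch on toy squares (not geobr data), relying on the fact that sf_use_s2() returns the previous setting; the helper sq() and the object names are made up for illustration.

```r
library(sf)

# Toy data (not from geobr): two adjacent unit squares in a geographic CRS.
# sq() is a hypothetical helper used only to build the example polygons.
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}
shapes <- st_sf(
  group = c("a", "a"),
  geometry = st_sfc(sq(0), sq(1), crs = 4326)
)

# sf_use_s2() returns the setting that was active before the call,
# so it can be restored once the union is done.
old <- sf_use_s2(FALSE)
merged <- st_union(shapes)
sf_use_s2(old)
```

With the municipality data, the st_union() call inside summarise() would sit between the two sf_use_s2() calls in the same way.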

For simplified=TRUE this is the resulting map: image

For simplified=FALSE this is the resulting map: image

The error is gone, but the state shapes still have some internal polygons. I was able to get rid of these by using sfheaders::sf_remove_holes (or nngeo::st_remove_holes) after st_union.

When using sf_remove_holes with simplified=TRUE this is the resulting map: (notice some minor artifacts inside some states, like a straight line in Bahia, Rondônia and Roraima states) image

When using sf_remove_holes with simplified=FALSE this is the resulting map: (this is the best version, with clean shapes for all states) image

For reference, this is the full code for this last map, which in my opinion is the best:

library(dplyr)
library(ggplot2)
library(geobr)
library(sf)
sf_use_s2(FALSE)  # use planar (GEOS) geometry operations instead of s2

cities <- read_municipality(code_muni = "all", simplified = FALSE)

cities |>
    group_by(abbrev_state) |>
    summarise(
        # merge each state's municipalities, then drop leftover interior rings
        geom = geom |> st_union() |> sfheaders::sf_remove_holes()
    ) |>
    ggplot() +
    geom_sf(fill="#2D3E50", color="#FEBF57", size=.15, show.legend = FALSE) +
    theme_minimal()

@rafapereirabr I'm reporting all of this (1) to help other people who might also be trying to union some shapes and running into issues, and (2) to help you debug a potential bug in the shapes dataset or the library. I say potential because I'm not sure whether there really is a problem with the library or the datasets, but I hope my report helps clarify what is happening.

rafapereirabr commented 3 months ago

Hi @gabrielmagno, thanks a lot for sharing this. It's super helpful. When users need to do any operation on the geometry column (e.g. calculating areas, merging/dissolving borders, creating contiguity matrices, etc.), it is strongly recommended to download the data using simplified = FALSE.

The problem you both reported suggests that some geometry issues might still persist for some years even when setting simplified = FALSE. I will have a closer look at this in the next few weeks.
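
For anyone hitting the same error in the meantime, one way to narrow it down is to check which features are invalid and try repairing them before any union or neighbourhood step. This is a minimal sketch on toy polygons (not geobr data, and with no CRS, so sf runs planar GEOS checks); whether st_make_valid() is enough for the affected municipalities is an assumption that has not been verified in this thread.

```r
library(sf)

# Toy data (not from geobr): one valid square and one self-intersecting
# "bow-tie" ring. No CRS is set, so sf treats the coordinates as planar.
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
bowtie <- st_polygon(list(rbind(c(2, 0), c(3, 1), c(3, 0), c(2, 1), c(2, 0))))
shapes <- st_sf(id = c("ok", "bad"), geometry = st_sfc(square, bowtie))

# Flag invalid features and ask for the reason they are invalid
valid <- st_is_valid(shapes)
st_is_valid(shapes[!valid, ], reason = TRUE)

# Attempt a repair, then confirm that every feature now passes the check
repaired <- st_make_valid(shapes)
all(st_is_valid(repaired))
```

With the real data the same calls would be applied to the object returned by read_municipality(). For longitude/latitude data with s2 enabled, sf may route the repair through s2 rather than GEOS, so results can differ from this planar toy case.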