inlabru-org / inlabru

inlabru
https://inlabru-org.github.io/inlabru/
78 stars 21 forks source link

gg.SpatialPolygons does not handle @data slot correctly #130

Closed ASeatonSpatial closed 2 years ago

ASeatonSpatial commented 2 years ago

I wanted to plot a polygon object and set aes(fill = covariate_name) where covariate_name is a covariate in a SpatialPolygonsDataFrame.

However this does not work:

library(INLA)
#> Warning: package 'INLA' was built under R version 4.1.2
#> Loading required package: Matrix
#> Loading required package: foreach
#> Loading required package: parallel
#> Loading required package: sp
#> This is INLA_21.12.01 built 2021-12-01 04:52:17 UTC.
#>  - See www.r-inla.org/contact-us for how to get help.
#>  - To enable PARDISO sparse library; see inla.pardiso()
library(inlabru)
library(ggplot2)
library(patchwork)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(SpatialEpi)  # contains scotland Lip cancer data 

# load data
data(scotland)
lip_data = scotland$spatial.polygon
lip_data$cases = scotland$data$cases

ggplot() +
  gg(lip_data, aes(fill = cases)) + 
  coord_equal() + 
  theme_minimal()
#> Regions defined for each Polygons
#> Error in FUN(X[[i]], ...): object 'cases' not found

This seems to be caused by fortify which doesn't store the information in the data slot of the SpatialPolygonsDataFrame.

test = fortify(lip_data)
#> Regions defined for each Polygons
head(test)
#>      long     lat order  hole piece            id           group
#> 1 131.769 856.564     1 FALSE     1 skye-lochalsh skye-lochalsh.1
#> 2 141.475 848.980     2 FALSE     1 skye-lochalsh skye-lochalsh.1
#> 3 135.941 869.855     3 FALSE     1 skye-lochalsh skye-lochalsh.1
#> 4 140.778 876.978     4 FALSE     1 skye-lochalsh skye-lochalsh.1
#> 5 146.218 873.198     5 FALSE     1 skye-lochalsh skye-lochalsh.1
#> 6 152.571 862.238     6 FALSE     1 skye-lochalsh skye-lochalsh.1

This change to gg.SpatialPolygons appears to fix the issue:

gg.SpatialPolygons <- function(data, mapping = NULL, crs = NULL, ...) {
  requireNamespace("ggplot2")
  if (!requireNamespace("ggpolypath", quietly = TRUE)) {
    stop("The 'ggpolypath' package is required for SpatialPolygons plotting, but it is not installed.")
  }
  if (!is.null(crs)) {
    data <- sp::spTransform(data, crs)
  }

  # join @data to object returned by fortify
  df <- ggplot2::fortify(data)  
  data$piece = as.factor(1:nrow(data))  
  df <- dplyr::left_join(df, data@data, by = "piece") 

  cnames <- names(df)[1:2]
  if (requireNamespace("ggpolypath", quietly = TRUE)) {
    dmap <- ggplot2::aes(
      x = .data[[cnames[1]]],
      y = .data[[cnames[2]]],
      group = .data[["group"]]
    )
  } else {
    dmap <- ggplot2::aes(
      x = .data[[cnames[1]]],
      y = .data[[cnames[2]]],
      group = .data[["id"]],
      subgroup = .data[["hole"]]
    )
  }

  if (!is.null(mapping)) {
    dmap <- modifyList(dmap, mapping)
  }

  params = list(...)
  arg = list(data = df,
             mapping = dmap)

  if (!any(names(params) %in% c("colour", "color"))) {
    arg$colour <- "black"
  }
  if (!any(names(params) %in% "alpha")) {
    arg$alpha <- 0.1
  }
  arg = modifyList(arg, params)

  if (requireNamespace("ggpolypath", quietly = TRUE)) {
    do.call(ggpolypath::geom_polypath, arg)
  } else {
    do.call(ggplot2::geom_polygon, arg)
  }
}

ggplot() +
  gg.SpatialPolygons(lip_data, 
                     aes(fill = cases), 
                     alpha = 1) +
  scale_fill_viridis_c() +
  coord_equal() + 
  theme_minimal()
#> Regions defined for each Polygons

Created on 2022-01-11 by the reprex package (v2.0.1)

ASeatonSpatial commented 2 years ago

Ah just noticed that this will give an error if there is no data slot. Adding a check e.g. via inherits(data, "SpatialPolygonsDataFrame") would solve that.

finnlindgren commented 2 years ago

Yes, ggplot2::fortify appears to mostly exist for backward compatibility. I notice that in gg.SpatialPixelsDataFrame we use as.data.frame() instead. The main trick is to keep track of the coordinate names, which ggplot2::fortify breaks, leading to the various extra code in the gg functions to get around that. Looking at the code from ggplot2:::fortify.SpatialPolygonsDataFrame, it's not clear what it even tries to do with the data part. It does start with as.data.frame(), but then by default doesn't use the result, which I'd argue is a bug in fortify. The fortify documentation points to broom, that points to broom::tidy and broom::sp_tidiers, which will be deprecated in favour of sf::st_as_sf. We may need some transition helpers of our own before fully adding sf support.

finnlindgren commented 2 years ago

But the broom:::tidy.SpatialPolygonsDataFrame code may have the same issue as fortify.

finnlindgren commented 2 years ago

The missing data slot is more to do with not having a gg.SpatialPolygonsDataFrame. A bit odd that.

finnlindgren commented 2 years ago

The piece index is an index within each polygon. The appropriate join-variable is id:

  df <- ggplot2::fortify(data)
  if (inherits(data, "SpatialPolygonsDataFrame")) {
    data[["id"]] <- unique(df[["id"]])
    df <- dplyr::left_join(df, as.data.frame(data), by = "id")
  }

I'll also change the default alpha value from 0.1 to 0.2; perhaps the default should be 1.

finnlindgren commented 2 years ago
library(INLA)
#> Loading required package: Matrix
#> Loading required package: foreach
#> Loading required package: parallel
#> Loading required package: sp
#> This is INLA_99.99.9999 built 2022-01-11 19:37:35 UTC.
#>  - See www.r-inla.org/contact-us for how to get help.
#>  - To enable PARDISO sparse library; see inla.pardiso()
library(inlabru)
library(ggplot2)
library(patchwork)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(SpatialEpi)  # contains scotland Lip cancer data

# load data
data(scotland)
lip_data = scotland$spatial.polygon
lip_data$cases = scotland$data$cases

ggplot() +
  gg(lip_data, aes(fill = cases), alpha=1) +
  coord_equal() +
  theme_minimal()
#> Regions defined for each Polygons

Created on 2022-01-11 by the reprex package (v2.0.1)