rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0
541 stars 89 forks source link

Utilise a function when rasterizing overlapping polygons #1041

Open Bartesto opened 1 year ago

Bartesto commented 1 year ago

Hi,

I am wondering if there is a way to apply a function to rasterization when working on overlapping polygons?

Context. I have a large multi-polygon dataset where each polygon represents the footprint of a bushfire and is attributed with the year it occurred. The dataset contains records back to 1937 for the whole of Western Australia. One task is to calculate the year since an area (or the whole State) last burnt.

I rasterize the vectors but I need to order them first by the year for this to create the correct output and the ordering imposes a time penalty when compared to something like fasterize::fasterize which can accept a function such as "max" which does the job.

Here's a toy example.

library(terra)

x1 <- rbind(c(2, 3), c(7, 3), c(7, 6), c(2, 6))
x2 <- rbind(c(3, 4), c(9, 4), c(9, 8), c(3, 8))
x3 <- rbind(c(4, 1), c(6, 1), c(6, 9), c(4, 9))
z <- rbind(cbind(object=1, part=1, x1, hole=0), cbind(object=2, part=1, x3, hole=0),
           cbind(object=3, part=1, x2, hole=0))
colnames(z)[3:4] <- c('x', 'y')
p <- vect(z, "polygons")
d <- data.frame(id = 1:3, year = c(2023, 1990, 2021))
values(p) <- d
plot(p, "year")

1stplot

What I am after can be created by:

template <- rast(p, res = 1)
rst <- p[order(p$year),] |>
  rasterize(template, field = "year")

or

library(sf)
library(raster)
library(fasterize)

sf_p <- st_as_sf(p) # fasterize works on sf class

raster_template <- raster(sf_p, res = 1) # fasterize works with raster class template

rst_fast <- fasterize(sf = sf_p, raster = raster_template, 
                             field = "year", fun = "max")

Either of these methods when plotted produce the below which has the rasterized polygons layered in descending order, with 2023 (green) on top, followed by 2021 then 1990. With multi-polygon datasets in the order of 1000's + polygons rasterize is half as fast when ordering first and I note that a function can only be applied to a point vector dataset.

2ndplot

Thanks Bart

rhijmans commented 1 year ago

raster::rasterize also has that option (but is especially slow). With terra::rasterize you can do "sum=TRUE", but I will add min, max and mean (and use the "fun=" argument for that).

Here is another approach (that I would not want to use with large datasets)

u <- union(p)
tu <- t(data.frame(u))
i <- tu==0
tu[tu==1] <- p$year[tu * (1:nrow(tu))]
tu[i] <- NA
u$year <- apply(tu, 2, max, na.rm=TRUE)
template <- rast(p, res = 1)
r <- rasterize(u, template, field = "year")
Bartesto commented 1 year ago

Thanks @rhijmans. Really interesting to see that work around and appreciate you adding to the rasterize function. I'm weaning some folks off long running vector work in a GIS and this will go a very long way.

Regards Bart

Farewe commented 1 year ago

Hello @rhijmans ,

Will there be a possibility to use a custom function to rasterize polygons in the future?

A typical example would be calculating rasters of species richness from IUCN range maps, like the good old example of raster's rasterize: fun=function(x, ...) {length(unique(na.omit(x)))}

Switching from raster to terra in teaching practicals has made this task more complex to explain than in raster. It requires pre-processing of polygons which makes the process much slower in terra's rasterize than with raster's raterize. See for example how @damariszurell handled it by rasterizing all polygons individually here. In my own practicals I opted to union the polygons beforehand and use the sum function (an example in section #3.3 here).

rhijmans commented 1 year ago

@Farewe: you can use fun="sum" or that. I have now also added fun="count" to make that intent clearer.

library(terra)
v <- vect(system.file("ex/lux.shp", package="terra"))
v <- buffer(v, 10000)
r <- rast(system.file("ex/elev.tif", package="terra"))
x <- rasterize(v, r, fun="sum")

# now you can also do
y <- rasterize(v, r, fun="count")