Open zeehio opened 1 year ago
library(bench)
library(ggplot2)
library(cowplot)
library(reshape2) # for melt
library(forcats)
library(grid)
library(rlang)
We profile the geom_raster() naïve strategy. We see that most of the time is spent training positional scales, yet for a matrix only the corners matter.
We define a custom geom_matrix_raster() that simplifies the handling of the positional scales.
The time spent goes from 45 seconds to 18 seconds, still far from the 5.5 seconds we are targeting.
In the next message, we will profile the custom geom to understand where there is room for optimization.
pv <- profvis::profvis({
gplt <- naive_strategy(mat_big)
benchplot(gplt)
})
pv
I don’t know how to summarize and represent the output of profvis in a plot easily, so I’ll just provide here the interpretation of my result:
Total time: 38 seconds
Our naive strategy spends a huge amount of time mapping positions, but for a matrix only the corners should matter. Let’s first write a better strategy, with our own geom.
If you want to try this geom, you can install my ggmatrix package:
remotes::install_github("zeehio/ggmatrix")
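The idea that only the corners matter for the positional scales can be illustrated with a small sketch. The helper name `cell_extent()` is hypothetical and not part of ggmatrix; it only mirrors the half-step padding the geom applies, assuming evenly spaced cell centres.

```r
# Hypothetical helper (not part of ggmatrix): given the coordinates of the
# first and last cell centres along one axis and the number of cells,
# return the outer edges of the raster along that axis.
cell_extent <- function(first, last, n) {
  # Distance between neighbouring cell centres; a single cell gets width 1
  step <- if (n > 1L) (last - first) / (n - 1L) else 1
  # Pad by half a cell on each side so that cells are centred on their coordinates
  c(first - step / 2, last + step / 2)
}

# Ten cells centred at 1, 2, ..., 10 span from 0.5 to 10.5
x_extent <- cell_extent(1, 10, 10L)
x_extent
```

With this, each positional scale only needs to be trained on two values per axis instead of one value per matrix cell.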
# Copyright 2022 Sergio Oller Moreno <sergioller@gmail.com>
# This file is part of the ggmatrix package and it is distributed under the MIT license terms.
# Check the ggmatrix package license information for further details.
#' Raster a matrix as a rectangle, efficiently
#'
#'
#' @param matrix The matrix we want to render in the plot
#' @param xmin,xmax,ymin,ymax Coordinates where the corners of the matrix will
#' be centered. By default they are taken from rownames (x) and colnames (y) respectively.
#' @param interpolate If `TRUE`, interpolate linearly, if `FALSE` (the default) don't interpolate.
#' @param flip_cols,flip_rows Flip the rows and columns of the matrix. By default we flip the columns.
#' @inheritParams ggplot2::geom_raster
#'
#' @export
geom_matrix_raster <- function(matrix, xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL,
interpolate = FALSE,
flip_cols = TRUE,
flip_rows = FALSE,
show.legend = NA,
inherit.aes = TRUE)
{
data <- data.frame(values = c(matrix))
mapping <- aes(fill = .data$values)
if (is.null(xmin)) {
xmin <- as.numeric(rownames(matrix)[1L])
}
if (is.null(xmax)) {
xmax <- as.numeric(rownames(matrix)[nrow(matrix)])
}
if (is.null(ymin)) {
ymin <- as.numeric(colnames(matrix)[1L])
}
if (is.null(ymax)) {
ymax <- as.numeric(colnames(matrix)[ncol(matrix)])
}
if (nrow(matrix) > 1L) {
x_step <- (xmax - xmin)/(nrow(matrix) - 1L)
} else {
x_step <- 1
}
if (ncol(matrix) > 1L) {
y_step <- (ymax - ymin)/(ncol(matrix) - 1L)
} else {
y_step <- 1
}
# we return two layers, one blank to create the axes and handle limits, another
# rastering the matrix.
corners <- data.frame(
x = c(xmin - x_step/2, xmax + x_step/2),
y = c(ymin - y_step/2, ymax + y_step/2)
)
corners_xy <- corners
x_y_names <- names(dimnames(matrix))
if (is.null(x_y_names)) {
x_y_names <- c("rows", "columns")
}
colnames(corners) <- x_y_names
x_name <- rlang::sym(x_y_names[1L])
y_name <- rlang::sym(x_y_names[2L])
list(
layer(
data = corners, mapping = aes(x=!!x_name, y=!!y_name), stat = StatIdentity, geom = GeomBlank,
position = PositionIdentity, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(), check.aes = FALSE
),
layer(
data = data,
mapping = mapping,
stat = StatIdentity,
geom = GeomMatrixRaster,
position = PositionIdentity,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list2(
mat = matrix,
matrix_nrows = nrow(matrix),
matrix_ncols = ncol(matrix),
corners = corners_xy,
flip_cols = flip_cols,
flip_rows = flip_rows,
interpolate = interpolate
)
)
)
}
GeomMatrixRaster <- ggproto(
"GeomMatrixRaster", Geom,
non_missing_aes = c("fill"),
required_aes = c("fill"),
default_aes = aes(fill = "grey35"),
draw_panel = function(self, data, panel_params, coord, mat, matrix_nrows, matrix_ncols,
corners, flip_cols, flip_rows, interpolate) {
if (!inherits(coord, "CoordCartesian")) {
rlang::abort(c(
"GeomMatrixRaster only works with coord_cartesian"
))
}
corners <- coord$transform(corners, panel_params)
if (inherits(coord, "CoordFlip")) {
byrow <- TRUE
mat_nr <- matrix_ncols
mat_nc <- matrix_nrows
nr_dim <- c(matrix_nrows, matrix_ncols)
} else {
byrow <- FALSE
mat_nr <- matrix_nrows
mat_nc <- matrix_ncols
nr_dim <- c(matrix_ncols, matrix_nrows)
}
x_rng <- range(corners$x, na.rm = TRUE)
y_rng <- range(corners$y, na.rm = TRUE)
mat <- matrix(
farver::encode_native(data$fill),
nrow = mat_nr,
ncol = mat_nc,
byrow = byrow
)
if (flip_cols) {
rev_cols <- seq.int(mat_nc, 1L, by = -1L)
mat <- mat[, rev_cols, drop = FALSE]
}
if (flip_rows) {
rev_rows <- seq.int(mat_nr, 1L, by = -1L)
mat <- mat[rev_rows, , drop = FALSE]
}
nr <- structure(
mat,
dim = nr_dim,
class = "nativeRaster",
channels = 4L
)
rasterGrob(nr, x_rng[1], y_rng[1],
diff(x_rng), diff(y_rng), default.units = "native",
just = c("left","bottom"), interpolate = interpolate)
},
draw_key = draw_key_rect
)
efficient_strategy <- function(mat) {
gplt <- ggplot() +
geom_matrix_raster(matrix = mat) +
scale_fill_gradient(trans = "log2")
gplt
}
cowplot::plot_grid(
naive_strategy(mat_small) + labs(title = "naive"),
efficient_strategy(mat_small) + labs(title = "efficient"),
ncol = 2
)
Same results, how about performance?
bm_efficient <- bench::mark(
efficient = {
gplt <- efficient_strategy(mat_big)
benchplot(gplt)
},
iterations = 1L
)
## Warning: Some expressions had a GC in every iteration; so filtering is disabled.
bm_efficient
## # A tibble: 1 × 6
## expression min median `itr/sec` mem_alloc `gc/sec`
## <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
## 1 efficient 19.4s 19.4s 0.0516 5.61GB 0.826
Our new strategy is much better than geom_raster(), but we are still far from our target.
strategies <- rbind(
bm_baseline,
bm_fast_fair,
bm_efficient
)
strategies
## # A tibble: 3 × 6
## expression min median `itr/sec` mem_alloc `gc/sec`
## <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
## 1 naive_strategy 49.34s 49.34s 0.0203 28.59GB 0.669
## 2 fast_fair 6.35s 6.35s 0.157 1.79GB 0.630
## 3 efficient 19.36s 19.36s 0.0516 5.61GB 0.826
ggplot(strategies) +
geom_col(aes(
x = forcats::fct_reorder(as.character(expression), as.numeric(min)),
y = as.numeric(min),
fill=as.character(expression))
) +
coord_flip() +
labs(x = "Strategy", y = "CPU time (s)") +
guides(fill = "none")
Our efficient strategy is more than twice as fast as geom_raster(), but according to our fast but fair implementation it could still be three times faster.
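The ratios behind that statement can be checked directly from the table above. This is only arithmetic on the reported `min` timings, typed in by hand:

```r
# Timings (in seconds) copied from the benchmark table above
timings <- c(naive_strategy = 49.34, fast_fair = 6.35, efficient = 19.36)

# Speed-up of the efficient strategy over geom_raster()
speedup_vs_naive <- unname(timings["naive_strategy"] / timings["efficient"])

# Remaining gap between the efficient strategy and the fast but fair reference
gap_to_fast_fair <- unname(timings["efficient"] / timings["fast_fair"])

round(c(speedup_vs_naive = speedup_vs_naive, gap_to_fast_fair = gap_to_fast_fair), 2)
# speedup_vs_naive is about 2.55, gap_to_fast_fair is about 3.05
```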
We will next profile and target the slowest parts of the efficient strategy, to improve its performance.
library(bench)
library(ggplot2)
library(cowplot)
library(reshape2) # for melt
library(forcats)
library(grid)
library(rlang)
We profile the efficient strategy described before. The main bottlenecks in the code are:
pv <- profvis::profvis({
gplt <- efficient_strategy(mat_big)
benchplot(gplt)
})
pv
I don’t know how to summarize and represent the output of profvis in a plot easily, so I’ll just provide here the interpretation of my result:
Total time: 18 seconds
- 2.5 seconds training and mapping positions, spent in match_id <- match(layer_data$PANEL, layout$PANEL)
- 10 seconds in self$map, mapping the fill to colours. This includes self$na.value.
- 2.8 seconds drawing the geom
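One possible way to get a tabular summary instead of reading the interactive profvis view is base R's sampling profiler, `Rprof()`, together with `summaryRprof()`. The sketch below is not what was used for the numbers above, and the workload is a placeholder; in this issue it would be replaced by the `efficient_strategy(mat_big)` and `benchplot(gplt)` calls.

```r
# Write profiling samples to a temporary file, sampling every 10 ms
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)

# Placeholder workload (assumption): replace with the plotting code to profile
for (i in 1:20) {
  invisible(sort(runif(1e6)))
}

# Stop profiling and summarize the collected samples
Rprof(NULL)
prof <- summaryRprof(prof_file)

# Time per function including callees, largest first
head(prof$by.total, 10)
```

The `by.total` table is a regular data frame, so it can be passed to ggplot2 like any other data if a plot of the bottlenecks is wanted.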
The next message will cover improvements in scale mapping.
Hi @thomasp85,
I rebased all patches and I've sent all the remaining pull requests (14 in total)! I have prepared a summary of all the PRs and a couple of plots comparing the performance before and after applying them. I'd be happy to introduce the pull requests to you if you like, whenever this suits you :+1:
This takes care of all the bottlenecks related to this issue.
There is one more possible optimization related to oob checking in the scales package, but I will leave that for the future.
Summary
We see how a plot of a 4k x 3k matrix can be made roughly 8 times faster than when using geom_raster() (45 seconds to 5.6 seconds).
The differences in performance tell us that geom_raster() may not be the best choice to rasterize a matrix.
We can bring the timing further down to 1.5 seconds if we omit handling of missing values and reduce the palette of colours, but these are shortcuts ggplot2 can’t make.
In this issue we will see an efficient way of rasterizing a matrix. We'll see which are the main ggplot2 bottlenecks affecting the performance of that efficient approach and we'll get to some pull requests to address those issues.
This issue is structured in several messages:
All code is included just for reproducibility; you are not expected to linger on the details.
If I happen to catch your attention, I'm looking for opportunities ideally starting around summer 2023. Happy to do remote work from Barcelona (Spain) or Mexico and open to relocation if needed.
Introduction: A small example
In my field of work it is common to have a matrix that we have to plot with something similar to filled.contour or geom_raster. The matrix has two associated axes, one for rows and one for columns. We often need a scale transformation.
Here is a small example of the data:
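The original data is not reproduced in this message. As a hypothetical stand-in, a matrix with the properties described above (numeric row and column coordinates stored in the dimnames, strictly positive values so that a log scale applies) might be built like this. The name `mat_demo_small` and the values are assumptions made for illustration.

```r
# Hypothetical example data: 10 rows x 6 columns
x_coords <- seq(1, 10, length.out = 10)
y_coords <- seq(1, 6, length.out = 6)

# Strictly positive values, so that a log2 fill scale can be applied later
mat_demo_small <- outer(x_coords, y_coords, function(a, b) 2^(a / 2 + b / 3))

# The axes travel with the matrix as named dimnames (stored as character)
dimnames(mat_demo_small) <- list(
  rows = as.character(x_coords),
  columns = as.character(y_coords)
)

mat_demo_small[1:3, 1:3]
```

Storing the coordinates in the dimnames matches what geom_matrix_raster() reads by default through rownames() and colnames().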
We can use geom_raster() to plot it. Let’s call this the naïve strategy, because we just use ggplot in a naïve way. This approach is great and it works, but we’ll see that it does not scale very well…
Scaling with the naïve strategy:
The same problem, with a 4000 x 3000 matrix:
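The code for this step did not survive in this message. A minimal sketch of what a naïve strategy of this kind could look like is given below, assuming the matrix is melted into a long data frame and passed to geom_raster() with the same log2 fill scale used elsewhere in this issue. The names `naive_strategy_sketch()` and `make_demo_matrix()` are hypothetical, and the demo matrix is kept smaller than 4000 x 3000 so the sketch stays quick to run.

```r
library(ggplot2)
library(reshape2) # for melt

# Hypothetical reconstruction, not necessarily the original naive_strategy()
naive_strategy_sketch <- function(mat) {
  # One row per matrix cell; numeric-looking dimnames are converted to numbers
  long <- reshape2::melt(mat, varnames = c("x", "y"), value.name = "value")
  ggplot(long) +
    geom_raster(aes(x = x, y = y, fill = value)) +
    scale_fill_gradient(trans = "log2")
}

# Hypothetical helper producing a positive random matrix with numeric dimnames
make_demo_matrix <- function(nr, nc) {
  m <- matrix(2^runif(nr * nc, min = 0, max = 10), nrow = nr, ncol = nc)
  dimnames(m) <- list(
    rows = as.character(seq_len(nr)),
    columns = as.character(seq_len(nc))
  )
  m
}

mat_demo <- make_demo_matrix(400L, 300L)
plt <- naive_strategy_sketch(mat_demo)
```

Note that the long data frame has one row per cell, so a 4000 x 3000 matrix means 12 million rows going through every scale and stat step.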
This is what it looks like:
Cutting corners strategy:
This is “as fast as I can make it”. It is useful as a reference for what can be done, but it is not realistic to expect ggplot2 to be this fast, due to these shortcuts being taken:
Let’s apply this to the small matrix:
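The original code for this strategy is not shown here. As a rough illustration of the kind of shortcut meant (a fixed palette of 256 colours and no handling of missing values), a sketch on a small matrix might look like the following. The helper `to_palette_index()` and the choice of `hcl.colors()` are assumptions, not the implementation that was benchmarked.

```r
# Hypothetical helper: map numeric values linearly onto palette slots 1..n_colours.
# Shortcut: assumes no NA values and a non-constant input.
to_palette_index <- function(values, n_colours = 256L) {
  rng <- range(values)
  1L + as.integer(floor((values - rng[1]) / diff(rng) * (n_colours - 1L)))
}

# A small positive matrix, coloured on a log2 scale
mat_tiny <- matrix(2^seq(0, 5, length.out = 12), nrow = 3)
idx <- to_palette_index(log2(mat_tiny))

# Look up colours in a fixed 256-entry palette and keep the matrix shape
palette_256 <- grDevices::hcl.colors(256L)
cols <- palette_256[idx]
dim(cols) <- dim(mat_tiny)

# A raster object that grid can draw directly
ras <- grDevices::as.raster(cols)
# grid::grid.raster(ras, interpolate = FALSE)  # uncomment to draw it
```

Indexing into a precomputed palette avoids evaluating a colour ramp for every cell, which is the part a general-purpose scale cannot skip.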
Fast and fair strategy
If we avoid cutting corners, we can still get quite decent performance. Here we take care of missing values (if there were any) and we don’t limit the palette to 256 colours.
Let’s apply this to the small matrix to assess correctness visually:
Comparison of all strategies:
We can see how ggplot2 creates a lot of extra copies: it allocates and deallocates 28GB of RAM. This hints at room for improvement.
ggplot2 is taking more than 40 seconds, while other approaches need close to 5 seconds.
There is clearly room for improvement, and addressing it is my goal in this issue.