thomasp85 / ggforce

Accelerating ggplot2
https://ggforce.data-imaginist.com

Reusing previously created geom_mark_hull() layer without recalculating? #321

Open razorofockham opened 6 months ago

razorofockham commented 6 months ago

Thanks for the really nice package! I read in the documentation that "The hull is calculated at draw time, and can thus change as you resize the plot". I was wondering whether there is an easy way to reuse a previously calculated hull layer and skip the recalculation, for instance when plotting the exact same hulls over the same set of points while only the underlying layer changes. Below is a toy example of what I am trying to achieve. Here, plotDF is a data frame with X/Y coordinates for points on a square grid. Each point has an intensity Z for each of a set of variables Var, and some (but not all) of the points are each assigned to exactly one region (given in the Region column).

library(dplyr)
library(ggplot2)
library(ggforce)

hullDF <- plotDF[,c("X","Y","Region")] %>% filter(!is.na(Region)) %>% distinct()   # Unique X/Y/Region triplets
ggHull <- geom_mark_hull(data = hullDF, aes(x = X, y = Y, group = Region))         # Create hull layer before loop

for (V in unique(plotDF$Var)) {
    ggRasterAndHull <- ggplot(data = subset(plotDF, Var==V)) +
                       geom_raster(aes(x = X, y = Y, fill = Z)) +
                       ggHull    # Create raster for variable "V" overlaid with hull
    print(ggRasterAndHull)
}
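
The toy example above does not define plotDF. A minimal sketch of a data frame matching the description could look like the following; the grid size, the variable names ("Var1" to "Var3"), and the two rectangular regions are assumptions made purely for illustration and are not taken from the original report.

```r
# Hypothetical toy data: grid size, variable names, and region layout are
# illustrative assumptions only.
set.seed(1)
grid <- expand.grid(X = 1:20, Y = 1:20)

# Assign some points to a region and leave the rest unassigned (NA)
grid$Region <- NA_character_
grid$Region[grid$X <= 6 & grid$Y <= 6] <- "A"
grid$Region[grid$X >= 14 & grid$Y >= 12] <- "B"

# One intensity value Z per grid point and per variable Var
vars <- c("Var1", "Var2", "Var3")
plotDF <- do.call(rbind, lapply(vars, function(v) {
  cbind(grid, Var = v, Z = runif(nrow(grid)))
}))
```

With this in place, the loop above runs once per value of Var and draws the same two hulls each time.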

The issue I find is that, as mentioned in the documentation, the hull layer gets recalculated at every iteration (I assume on the call to print()?). In this case that is not really necessary, yet it takes more time than any other step in the loop.

Thanks a lot for any guidance/hints you can provide!

thomasp85 commented 6 months ago

The hull is actually created much later than print(). It is calculated right before the device draws it, which means it updates as the device is resized. This is necessary because the calculation uses absolute positions in device space (this matters for concave hulls but not for convex hulls).

So there is not really any meaningful way for it to be made available and reused. Are you running into performance issues with it, since you feel the need to reuse it?

razorofockham commented 6 months ago

Hi Thomas, thanks for the prompt reply. Indeed, the motivation is improving performance, since in practice I am using this to generate a PDF report with about 100-150 different Var variables, and for each of them I overlay a total of around 150-200 Region hulls (with 20-50 X/Y points per region), so the impact of the hull calculation is quite major (~10 sec per iteration vs ~0.5 sec if I do not add the hulls). Since my PDF device size should not change, I thought this hull-reusing idea could maybe be exploited to accelerate things, but if there is no easy way to go about it, then I'll just live with it. Thanks so much anyway!
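
One possible workaround, if plain convex hulls in data coordinates are acceptable, is to compute the hull vertices once with base R's grDevices::chull() and draw them with geom_polygon(). This is a substitute technique, not a way of reusing geom_mark_hull(): it drops the expansion, rounded corners, and concavity that geom_mark_hull() provides. The hullDF below is a hypothetical stand-in for the one in the original example, and the names hullPolys and ggHullStatic are made up for this sketch.

```r
library(ggplot2)

# Hypothetical stand-in for hullDF: unique X/Y/Region triplets
hullDF <- rbind(
  data.frame(expand.grid(X = 1:5, Y = 1:5), Region = "A"),
  data.frame(expand.grid(X = 10:14, Y = 8:12), Region = "B")
)

# Compute each region's convex hull once, in data coordinates.
# chull() returns the row indices of the hull vertices in order.
hullPolys <- do.call(rbind, lapply(split(hullDF, hullDF$Region), function(d) {
  d[grDevices::chull(d$X, d$Y), ]
}))

# Ordinary polygon layer: the vertices are fixed, so no hull is
# recalculated when the plot is drawn.
ggHullStatic <- geom_polygon(
  data = hullPolys,
  aes(x = X, y = Y, group = Region),
  fill = NA, colour = "black", inherit.aes = FALSE
)
```

In the loop from the first post, ggHullStatic would take the place of ggHull. Whether this is actually faster for 150-200 regions per page has not been measured here.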