rstudio / packrat

Packrat is a dependency management system for R
http://rstudio.github.io/packrat/
401 stars 89 forks source link

"Non recursive" optimisation - 10x speed improvement for our RStudio Connect deployments #603

Closed slodge closed 4 years ago

slodge commented 4 years ago

This PR introduces a global base R level option to change the recursive search option used by .snapshotImpl when calling getPackageRecords for inferred packages.

This PR is inspired by a previous commit https://github.com/rstudio/packrat/commit/ac1a18cf856f440b64c5db44401c350b16b94636 referenced by @ras44 in #347 - but then reversed as "it broke a couple internal RStudio applications" (@kevinushey)

Testing locally this has significantly reduced rsconnect::writeManifest times for our Shiny apps while producing the same manifest results.

I'm not expecting this PR to be approved "as is" - while it's worked for all the apps I've tested with, I expect it may have implications in other environnments.

I'm also not glued to this change being introduced as a global R option - we just went that route as it required a minimal change and as it's easy to turn on/off during testing.

I'd be happy to consider alternative fixes and to put work into investigating them - but currently we need to do something as the RStudio Connect deployment experience is far too slow for our users. An optimisation like this one would make a significant difference to the perceived usability of our RStudio Connect instances.

I did try the other fix suggested in #347 - using packrat::set_opts(ignored.directories=c("all","my","R","src","directories") - however, this didn't yield any performance improvements for me even when modifying the rsconnect package to use different snapshotImpl arguments.

Referencing:

347

532

Possibly also linked to other speed issues - e.g. #290

Measurement script:

options(packrat.RecursiveInference = T)
tictoc::tic()
rsconnect::writeManifest(appDir="../C19", appFiles = "app.R" )
tictoc::toc(log = T)

options(packrat.RecursiveInference = F)
tictoc::tic()
rsconnect::writeManifest(appDir="../C19", appFiles = "app.R" )
tictoc::toc(log = T)

Test app.R

library(shiny)
library(tidyverse)
library(coronavirus)

all_countries <- coronavirus %>%
  pluck("country") %>%
  unique() 

ui <- fluidPage(

  titlePanel("C19"),

  sidebarLayout(
    sidebarPanel(
      selectInput(
        "country",
        label = "Which Country?", 
        choices =  all_countries, 
        selected = "United Kingdom",
        multiple = TRUE,
        selectize = TRUE)
    ),

    mainPanel(
      plotOutput("thePlot")
    )
  )
)

server <- function(input, output) {

  output$thePlot <- 
    renderPlot({

      coronavirus %>% 
        filter(type == "confirmed",
               country %in% input$country) %>%
        ggplot(aes(date, cases, colour=country)) +
        geom_line() +
        facet_wrap(~country)

  })
}

shinyApp(ui = ui, server = server)

# runApp()
kevinushey commented 4 years ago

LGTM -- thanks for the PR!

rmccombie commented 3 years ago

Feedback: 55750

aronatkins commented 3 years ago

Also related is our more recent work in https://github.com/rstudio/packrat/pull/615, where packrat avoids repeatedly processing the same dependency.

ras44 commented 3 years ago

@kevinushey @aronatkins we've heard recommendations on moving towards Renv. Do you have any thoughts on whether these types of improvements will pass through to Renv? I'm a bit unclear if Renv is a layer over packrat, or something altogether separate. Thanks!

kevinushey commented 3 years ago

renv is effectively a ground-up rewrite of Packrat, but done in a way to be mostly compatible with Packrat, or at least the API should feel familiar to those experienced with Packrat. We plan to continue maintaining Packrat for critical fixes required for RStudio Connect, but future development will focus on the renv package.

In the future, we aim to eventually migrate the deployment process to use renv rather than Packrat, but that work has not yet been scheduled.

ras44 commented 3 years ago

@kevinushey thanks for your quick response. So would it be safe to say that the renv rewrite doesn't suffer from the performance issues that this PR and others referenced appear to be trying to resolve in packrat? Just trying to gauge whether we should switch to renv for superior performance, or continue using packrat but with these fixes. Thanks for your guidance!

kevinushey commented 3 years ago

That's correct. In general, you can view renv as a better Packrat. (If there's something that Packrat does that renv does not that you require, please let me know as well)