rstudio / shiny

Easy interactive web applications with R
https://shiny.posit.co/

Socket issue with large maps #3086

Closed ifellows closed 1 year ago

ifellows commented 4 years ago

I am developing several GIS applications using Shiny that are quite data intensive (i.e. large files). The applications work perfectly either locally or within a LAN. However, when I deploy them, a mysterious socket issue occurs that kills the application. Sometimes the error occurs on launch, and other times it occurs while manipulating controls.

Nothing appears in the server logs, but in the JavaScript console I get either

WebSocket connection to 'wss://fellstat.app/shiny/app_direct/testapp/websocket/' failed: Could not decode a text frame as UTF-8.

or the errors in the image below. Poking around a bit in the JavaScript code, it seems like the socket is getting dropped.

I've tried deploying via both ShinyProxy and shinyapps.io, and both give the same error. The error is also occurring both with implementations using leaflet and those using mapdeck, which is why I'm thinking this is a more general shiny issue. I've trimmed down and de-identified the application. It is linked below. I've also tried to organize it so the error occurs every time on load. The size is rather large because this only really happens with bigger data.

Any guidance you can give would be appreciated. Additionally, my organization is currently in the process of deploying RStudio Connect, so it would be useful to know if there are any configuration options we can put in place to prevent this.

shinyapps.io = https://fellstat.shinyapps.io/shiny_app/
ShinyProxy = https://fellstat.app/shiny/app_direct/testapp
Code = https://www.dropbox.com/s/azi1ji5yw84a0jr/shiny_app.zip?dl=1

[Image: Screen Shot 2020-10-02 at 1 24 14 PM]

jcheng5 commented 4 years ago

Hmmm, this is a new one. Is Windows your primary development platform? And specifically, is that the platform on which data_deid.RData was created?

ifellows commented 4 years ago

Both were done on a rocker/rstudio docker container running:

Distributor ID: Debian
Description:    Debian GNU/Linux 10 (buster)
Release:        10
Codename:       buster

jcheng5 commented 4 years ago

I'm able to reproduce on Shiny Server. Wireshark shows the websocket data being truncated at about 135KB (looks like the data is coming from the mapdeck package). I'm very surprised to see this; Leaflet routinely sends this amount of data I think.

(Incidentally, did you really mean for the popup_html to be all those numbers? Or is that just an artificial thing to repro the issue easily?)

ifellows commented 4 years ago

Yes it is artificial. In reality it is a nice html formatted table giving info about that facility. I made it about the same character length.

Note that this behavior is, if anything, worse in leaflet. I switched to mapdeck because of this issue.

jcheng5 commented 4 years ago

OK. The message your app is sending is actually not 135KB, it's 121MB. It turns out that these different hosting servers all have their own per-message limits, I believe due to the WebSocket client implementations being used. (I also assume that in all cases the WebSocket client libraries are configurable, in which case these limits could be changed.)

You can use this app to test on whatever platform you like. The JS console will indicate when the client receives the message from the server.

library(shiny)

ui <- fluidPage(
  numericInput("bytes", "Bytes", 2^26),
  actionButton("go", "Go"),
  tags$script(HTML(
    "Shiny.addCustomMessageHandler('test', msg => console.log(msg.length + ' bytes received'));"
  ))
)

server <- function(input, output, session) {
  observeEvent(input$go, {
    showNotification(list(input$bytes, " bytes being sent"))
    session$sendCustomMessage("test",
      paste0(collapse = "", rep_len("0", input$bytes)))
  })
}

shinyApp(ui, server)

Thanks for bringing this to our attention, I don't know how I didn't know this before.

ifellows commented 4 years ago

Wow! Thank you so much for looking into this so quickly. It is very interesting. The app sent is a bit heavier than what I'm deploying in order to make it reproducible. I tried mucking about with the websocket configuration options in shiny server to no avail. Is there some combination that might help with this? Also, if the issue is platform based, why is there a difference between deploying locally and remotely? I had initially thought that it might be related to a timeout issue.

One thing I noticed is that if I click the controls a bunch on my app before the first change has a chance to render, it will generally trigger the error, even if any one of the options alone is safe to select. Are the messages concatenated together to reach a size that triggers the limit?

jcheng5 commented 4 years ago

The limit is configurable in the underlying websocket library for Shiny Server, but it’s not exposed as a setting (I didn’t know it existed until now).

By “platform” I just meant hosting software or service. My assumption is it’s the proxy layer that’s causing this every time.

To work around this, can you split the data frame and send one chunk at a time? This will also have the nice side effect of giving the user a progressive render experience.

I don’t know about the multiple messages interacting with each other; I feel like I saw something similar on ShinyApps.io, but as far as I can tell from looking at the code this shouldn’t be the case for Shiny Server.
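The chunking suggestion above can be sketched in base R. This is a minimal illustration rather than code from the thread: `split_rows()` and its `chunk_rows` parameter are hypothetical names, and a suitable chunk size would depend on the per-message limit of whichever hosting layer is in use.

```r
# Hypothetical helper: split a data frame into consecutive chunks of at most
# `chunk_rows` rows each, so that each chunk can be sent as its own message.
split_rows <- function(df, chunk_rows) {
  stopifnot(chunk_rows >= 1)
  if (nrow(df) == 0) {
    return(list())
  }
  # Rows 1..chunk_rows fall in group 1, the next chunk_rows rows in group 2, ...
  group <- ceiling(seq_len(nrow(df)) / chunk_rows)
  unname(split(df, group))
}

# Invented example data: 10 rows split into chunks of at most 4 rows.
chunks <- split_rows(data.frame(id = 1:10, value = letters[1:10]), chunk_rows = 4)
chunk_sizes <- vapply(chunks, nrow, integer(1))
print(chunk_sizes)
```

With the invented 10-row data frame this yields three chunks of 4, 4, and 2 rows. Each element of `chunks` could then be passed to a separate update call, one message per chunk.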

ifellows commented 4 years ago

I tried chunking the data, which works and renders progressively when deployed locally, but I'm seeing the same error on shinyapps.io. I replaced the mapdeck_update line with:

    sq <- unique(ceiling(seq(from=0, to=nrow(df_plot_sub), length.out=40)))
    for(i in seq_len(length(sq)-1)){
      df <- df_plot_sub[(sq[i]+1):sq[i+1],]
      mapdeck_update(map_id = "map") %>%
        clear_scatterplot(paste0("scatter",i))%>%
        add_scatterplot(
          data = df,
          lat = "latitude",
          lon = "longitude",
          fill_colour = legend_label,
          #stroke_width=4,
          #stroke_colour = "fill_color",
          #tooltip = "pop",
          tooltip = "popup_html",
          radius = 1500,
          radius_min_pixels = 3,
          palette = "reds",
          legend=FALSE,
          update_view=FALSE,
          #auto_highlight = FALSE,
          layer_id=paste0("scatter",i)
        )
    }

jcheng5 commented 4 years ago

Hmmm, your code works fine on SSP, so it does seem like there's a different problem on ShinyApps.io. Here's a new and improved version of the test app that can send multiple big payloads at a time:

library(shiny)

ui <- fluidPage(
  numericInput("bytes", "Send this many bytes per payload", 2^26),
  numericInput("times", "Send the payload this many times", 1),
  radioButtons("datatype", "Type of payload data", c(
    "Random" = "random",
    "Zeros" = "zeros"
  )),
  actionButton("go", "Send to client"),
  tags$script(HTML(
    "Shiny.addCustomMessageHandler('test', msg => console.log(msg.length + ' bytes received'));"
  ))
)

server <- function(input, output, session) {
  observeEvent(input$go, {
    data <- switch(input$datatype,
      random = base64enc::base64encode(readBin("/dev/urandom", raw(), n = round(input$bytes * 0.75))),
      zeros = paste0(collapse = "", rep_len("0", input$bytes))
    )
    for (i in seq_len(input$times)) {
      showNotification(list(input$bytes, " bytes being sent"))
      session$sendCustomMessage("test", data)
    }
  })
}

shinyApp(ui, server)

I find that on ShinyApps.io, sending 20MB x 10 causes a disconnect, while the same thing is handled no problem on SSP and Connect. The latter is really mystifying as ShinyApps.io is built around Connect...

jcheng5 commented 4 years ago

Oh, I forgot, the default RAM limits on ShinyApps.io are pretty low. I bumped it up to 4GB on my instance and 20MB x 10 works just fine now.

ifellows commented 4 years ago

Thanks Joe,

I upgraded my shinyapps.io account to check, and yes it works so long as I (the client) have sufficient bandwidth and low enough latency. When I bounce off a proxy (in Norway for example), I get disconnects and errors showing up in the JavaScript console like:

shiny-server-client.min.js:1 Mon Oct 05 2020 17:01:55 GMT-0700 (Pacific Daylight Time) [DBG]: 2 message(s) discarded from buffer

I'm not sure if there is another timeout issue layering onto the large message size issue. I also believe that there is an issue with ShinyProxy perhaps trying to do something too smart with the messages and causing the "Could not decode a text frame as UTF-8" error, as I only see that in ShinyProxy. It seems like Connect will be coming online at a fortunate time, as my current ShinyProxy set-up cannot deploy these apps.

The solution I've come up with, that seems to work even bouncing off of Syrian servers, is to wait for a round trip from the client before sending each chunk of data. The outline of the solution is as follows:

server.R

  # The data to be plotted
  plot_data <- reactiveVal(NULL)

  # When the data changes, reset the index to 1
  observe({
   plot_data()
   updateNumericInput(session, "index",value=1)
  })

  # Plot a chunk of data and increment the index
  observeEvent(input$index, {
   df <- isolate(plot_data())
   if(is.null(df))
     return(NULL)
   i <- input$index
   if(i < 1){
     return(NULL)
   }

   sq <- unique(ceiling(seq(from=0, to=nrow(df), length=100)))
   if(i > length(sq)-1){
     updateNumericInput(session, "index",value=-1)
     return(NULL)
   }

   # Clear map at start of "loop"
   if(i == 1){
     for(j in 1:100){
       mapdeck_update(map_id = "map") %>%
         clear_scatterplot(paste0("scatter",j))
     }
   }

   # Plot a chunk
   df <- plot_data()[(sq[i]+1):sq[i+1],]
   mapdeck_update(map_id = "map") %>%
     add_scatterplot(
       data = df,
       ... options ...
       layer_id=paste0("scatter",i)
     )

   # Bump index
   updateNumericInput(session, "index",value=i+1)
  })

  # Determine what to plot based on user input
  observe({
    ... do stuff to decide what to plot...
    plot_data(df_to_plot)
  })

ui.R

...
    numericInput( # can be hidden inside a conditional panel
      "index",
      "index",
      value=-1
    )
...

The advantage of this solution is that I can provide a progress bar, and cancel/restart rendering if the user selects a different option.

I'm a bit flummoxed with how hard these failures are to diagnose. I can't be the only one torturing Shiny with outsized data, right?

Thank you so much for your amazing work and super fast help in coming up with a workable resolution.
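The same round-trip idea can also be expressed with a custom message and an acknowledgement input instead of a numeric input. The following is a minimal sketch under stated assumptions, not the code from the app discussed here: the message name `chunk`, the input name `chunk_ack`, and the dummy payload of twenty 1 MB strings are all made up for illustration. It relies on `session$sendCustomMessage()` on the server and `Shiny.setInputValue()` on the client.

```r
library(shiny)

ui <- fluidPage(
  actionButton("go", "Send in chunks"),
  tags$script(HTML("
    // Hypothetical handler: log the chunk, then acknowledge it so the
    // server knows it is safe to send the next one.
    Shiny.addCustomMessageHandler('chunk', function(msg) {
      console.log('chunk ' + msg.index + ': ' + msg.data.length + ' chars');
      Shiny.setInputValue('chunk_ack', msg.index, {priority: 'event'});
    });
  "))
)

server <- function(input, output, session) {
  # Pending payload, stored as a list of chunks (dummy data in this sketch).
  chunks <- reactiveVal(list())

  send_chunk <- function(i) {
    session$sendCustomMessage("chunk", list(index = i, data = chunks()[[i]]))
  }

  # Start a transfer: build the chunks and send only the first one.
  observeEvent(input$go, {
    chunks(replicate(20, strrep("0", 2^20), simplify = FALSE))
    send_chunk(1)
  })

  # Each acknowledgement from the client triggers the next chunk, if any.
  observeEvent(input$chunk_ack, {
    nxt <- input$chunk_ack + 1
    if (nxt <= length(chunks())) {
      send_chunk(nxt)
    }
  })
}

# Assigned rather than printed so that sourcing this file does not block;
# start it with runApp(app).
app <- shinyApp(ui, server)
```

Because only one chunk is in flight at a time, a slow connection delays the transfer rather than piling messages up. The sketch does not handle a second click on the button while a transfer is still running.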

jcheng5 commented 4 years ago

The "messages discarded from buffer" messages are normal for SSP, ShinyApps.io, and Connect.

> I'm a bit flummoxed with how hard these failures are to diagnose. I can't be the only one torturing Shiny with outsized data, right?

In my experience, the only scenarios where people are sending outsized data to the client with Shiny are leaflet and now, I guess, mapdeck. And it's usually polygons that are (too) detailed. The ideal solution, I think, would be using vector tiles.

What's making your calls so heavy in your real app? Is it actually in the popups? If so, can you implement the popups server-side, so not all of the values have to be transferred to the client?
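One rough way to find out what makes a payload heavy is to count the bytes of text in each column before sending it. This is a minimal base R sketch: `column_bytes()` is a hypothetical helper and the `facilities` data frame is invented. The counts ignore JSON overhead such as quoting and escaping, so they are only an approximation of the real message size.

```r
# Hypothetical helper: approximate the number of bytes each column would
# contribute when its values are written out as text.
column_bytes <- function(df) {
  vapply(
    df,
    function(col) {
      sum(as.numeric(nchar(as.character(col), type = "bytes")), na.rm = TRUE)
    },
    numeric(1)
  )
}

# Invented example data with a small numeric column and a larger HTML column.
facilities <- data.frame(
  latitude = c(1.5, 2.5),
  popup_html = c("<b>Facility A</b>", "<b>Facility B</b>"),
  stringsAsFactors = FALSE
)

# Largest contributors first.
sizes <- sort(column_bytes(facilities), decreasing = TRUE)
print(sizes)
```

If one column dominates, as the popup column does in this invented example, it is a candidate for being looked up server-side on demand instead of being shipped to the client up front.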

romunov commented 3 years ago

@ifellows you are not alone. My use case is plotting many gene elements (can be several tens of thousands, but biologists would not be averse to plotting all million elements) on a plotly scatterplot. The app dies in a container run from ShinyProxy but works fine if I use Shiny Server. :shrug: Luckily you guys were able to dig deep enough so that I can start thinking of a solution that will work for my use case.

andyquinterom commented 3 years ago

Hello,

This issue is not limited to large maps. I am getting the exact same issue on RStudio Connect simply by clicking a button repeatedly. I cannot reproduce the problem on my laptop with runApp().

MarkOughton commented 2 years ago

@jcheng5 Hi, I have an on-prem Shiny Server and am experiencing this issue. I am trying to render shapefiles of ~128MB. The app works when I run it directly on the R server but not when hosted on a webpage. The limit seems to be 64MB, having run the payload test above. I am new to Ubuntu, so I do not know whether I need to amend a .conf file to increase the socket size, and if so, which one? Or is there an additional line of HTML code I need to place into server.R, and if so, what is the syntax please?