rstudio / plumber

Turn your R code into a web API.
https://www.rplumber.io
Other
1.39k stars 255 forks source link

Plumber is not releasing memory #938

Open apalacio9502 opened 9 months ago

apalacio9502 commented 9 months ago

Hello,

We have detected that plumber is not releasing the memory when a query ends, for example, at this point we understand that there are two possible problems with the memory.

We solved the first problem by implementing the solution proposed in this comment https://github.com/rstudio/plumber/issues/496#issuecomment-541402503

We have tried to solve the second problem by adding a postserialize hook that will run gc(). This solution generated a significant change because it frees the objects used in the API. However, we have detected that the memory consumed by the file that is being delivered to the user is not released since the postserialize occurs before the delivery of information to the user is completed.

We have detected this problem because in our case we always have an active process to attend to the API requests. If we did not have an always active process this problem would not be a concern.

Below I leave the code that we have used to test the problem, with the postserialize solution.

library(plumber)
library(readr)

#* @apiTitle Plumber Test Memory Leak

#* Download data
#* @get /download_data
#* @serializer csv
function() {
data <- data.frame(
COLUMN_1 = rep("COLUMN_1",1e+8),
COLUMN_2 = rep("COLUMN_2",1e+8),
COLUMN_3 = rep("COLUMN_3",1e+8),
COLUMN_4 = rep("COLUMN_4",1e+8),
COLUMN_5 = rep("COLUMN_5",1e+8)
)
return(data)
}

#* @plumber
function(pr) {
  pr %>%
    pr_hook("postserialize", function(req) {
      if (req$PATH_INFO != "/" & req$PATH_INFO != "/openapi.json") {
        later::later(function() {
          gc(reset = TRUE, full = TRUE)
        }, delay = 0)
      }
    })
}

I think this topic becomes relevant when we talk about APIs in a productive environment with high demand (hosted in posit connect for example).

Regards,

schloerke commented 9 months ago

I can not definitively give fault to either R, plumber, readr, or any other package needed for execution (httpuv, R6, etc.).

Following https://memtools.r-lib.org/articles/memtools.html#comparing-snapshots-to-detect-leaked-objects , I did not find any evidence to support that there were any leaked objects.

I was not able to follow https://memtools.r-lib.org/articles/memtools.html#snapshotting-the-precious-list as it required more lldb skills than I have time.

Using your simple reprex app, if you are able to point in a definitive direction, I am happy to pursue code changes in any of the R packages in the execution stack.

apalacio9502 commented 9 months ago

Hello @schloerke,

Thanks for your answer.

I was trying to do the tests with memtools and when I tried to add mem_snapshot inside of the plumber function it gave me the following error! Internal error in r_vec_deref_const(): Unimplemented type expression. Surely I'm doing something wrong, you have an example of how memtools can be used with plumber to be able to do tests and try to find the cause of the problem I'm mentioning.

Regards,

GreenGrassBlueOcean commented 8 months ago

hello @schloerke and @apalacio9502

Hi I am having the same issue here. I have the idea that the issue is correlated with this issue: RestRserve

mitch-tanney-coterie commented 8 months ago

Our team has observed similar patterns recently with a docker-based deployment. The screenshot below from Grafana summarizes recent memory usage for a plumber API that serves predictions for 2 different models.

image

After deployment, the service ran at ~330 MiB. We then submitted a large request to one of the models which caused memory to spike as expected. We waited a few hours before submitting a single request to the other model endpoint. Memory dropped as expected. However, something is still being held in memory because the service is now using ~464 MiB. We've already implemented the change to jemalloc per the recommendation in Issue 496. We've also added gc() at the start and just before the return() statement for each model endpoint.

Our questions, therefore, are the following:

GreenGrassBlueOcean commented 8 months ago

I have been studying this a little further and I noticed that when using the future.callr package the memory accumulation decreases. My application performed significantly better and did not crash anymore Maybe you can check this and let me know if you @mitch-tanney-coterie also observe this?

Please find below a minimal example. Each endpoint should have a similar future promise construction.

#
# This is a Plumber API. You can run the API by clicking
# the 'Run API' button above.
#
# Find out more about building APIs with Plumber here:
#
#    https://www.rplumber.io/
#
library(plumber)
library(future)
library(promises)
library(future.callr)

future.seed = TRUE
future::plan(future.callr::callr())
print("planned session future.callr::callr")

#* Echo back the input
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  promises::future_promise({
   message(paste0("The message is: '", msg, "'"))
   msg = paste0("The message is: '", msg, "'")
   }, seed = TRUE)
}
mitch-tanney-coterie commented 8 months ago

@GreenGrassBlueOcean, thank you for the recommendation, and I apologize for the delayed reply here. Since my last post, we have continued to monitor our API in a test environment where memory seems to be stable at ~ 465 MiB. Before we complete any additional updates, I have several follow-up questions for you (no immediate rush whatsoever).

apalacio9502 commented 8 months ago

Hi @GreenGrassBlueOcean and @mitch-tanney-coterie,

I haven't tried the solution proposed by @GreenGrassBlueOcean to implement future.callr. However, it makes sense to me that this could solve the problem, as when using callr futures, each future is resolved in a fresh background R session that ends as soon as the future's value has been collected. Nevertheless, I believe this solution might not be the most appropriate, as creating a background process each time an API request is made could potentially cause unnecessary delays in the API.

I believe that the initial solution implementing jecmalloc and a postserializer performing garbage collection is more appropriate. However, the issue of the memory consumed by the data delivered to users, which is not freed, still remains

Regards,

GreenGrassBlueOcean commented 8 months ago

hi @apalacio9502 ,

I just noticed your response. I had at first the very same concern. But please try it out with the echo example from above. I was not able to notice a noticeable difference in response time.

What however is an issue is that just as with parallell r-sessions you have to be very specific what options, variables and packages are available to the session. So you have to very carefully make sure that your code is in line with the recommendation by the future package author at: vignette: A Future for R: Common Issues with Solutions

GreenGrassBlueOcean commented 8 months ago

hi @mitch-tanney-coterie,

I just noticed your post as well. I can not share much about the use case. unfortunately. But it is a simple rest application that gets a command makes a calculation in r and returns the response.

I noticed that plumber was not able to keep up with multiple calls at the same time. Furthermore I noticed that during it's 24 hours run (I always let the application run for max 24 hours) It stopped working. So for 20 hours it was doing what it should be doing (as long as it was not called at the same time) and then it suddenly stopped (most likely by a memory issue, as it began showing errors about r temp files being not available).

I then switched over to future.callr in combination with promises and my issues disappeared like magic. Now already running the application for 3 weeks now without a single issue. (like magic)

I noticed you run a docker image. For this application I don't run it in docker (because of the nature of this application) but on native windows.

However for docker I would recommend:

  1. test it in a r-session on your local pc, see if your application works as planned (options, variables, packages as described in the vignette in my previous post to @apalacio9502 )
  2. Create your own docker image from R. I use this one : https://github.com/openanalytics/r-base. You can easily rewrite this to the current r-version.
  3. Build a testing docker to see if it builds and then try to run your application. if you get a build error try to change packages and dependencies, play around a couple of times. Eventually it will build and work.
  4. Monitor the building of your application in something like Jenkins, apache airflow or github actions.

From my experience with my docker images this is generally the best way to built these.

tylerlittlefield commented 7 months ago

This has been a problem for me and happy to say its resolved thanks to this thread. I'll share my experience below.

Added the following to the plumber Dockerfile:

RUN apt-get update && apt-get install -y --no-install-recommends libssl-dev libjemalloc-dev
ENV LD_PRELOAD /usr/lib/x86_64-linux-gnu/libjemalloc.so

Added the following to plumber.R:

plan(future.callr::callr()) # <-- this used to be plan(multisession)

With these two changes, I finally see the memory released back to the node:

image

schloerke commented 7 months ago

PR Request:

Add FAQ entry (or small vignette) suggesting linux users to use:

The instructions / graphic from https://github.com/rstudio/plumber/issues/938#issuecomment-1962131908 could almost be taken verbatim. (Thank you, @tylerlittlefield !)