rexyai / RestRserve

R web API framework for building high-performance microservices and app backends
https://restrserve.org
271 stars 31 forks source link

Background process doesn't kill correctly the childs #209

Closed AbrJA closed 3 months ago

AbrJA commented 4 months ago

Hi @dselivanov, I hope you are doing well,

Thank you for the package, it's great!

I found an issue running an API using background = TRUE in Backend server$start, the child processes aren't killed when you execute ApplicationProcess$kill.

Let me show you a minimal example:

library(RestRserve)

app = Application$new()

app$add_get(
  path = "/health",
  FUN = function(.req, .res) {
    .res$set_body("OK")
  }
)

backend <- BackendRserve$new()

repeat {
  api <- backend$start(app, http_port = 8001, background = TRUE)
  Sys.sleep(60)
  api$kill()
}

I run this script using Rscript api.R, then I make some requests from R using my browser and Postman. If I check the processes info I see:

ajaimes@ajaimes:~$ ps -ef | grep bin/exec/R
UID          PID       PPID    C STIME TTY  TIME       CMD
ajaimes    66990    4434   0 16:24 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67325   66990  0 16:25 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67459   67325  0 16:25 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67486   67325  0 16:25 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67610   30403  0 16:26 pts/3    00:00:00 grep --color=auto bin/exec/R

After the 60 seconds when api was killed and restarted, I check again the processes and I see:

ajaimes@ajaimes:~$ ps -ef | grep bin/exec/R
UID          PID       PPID    C STIME TTY  TIME       CMD
ajaimes    66990    4434  0 16:24 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67459    1955  0 16:25 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67486    1955  0 16:25 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67633   66990  0 16:26 pts/0    00:00:00 /opt/R/4.2.2/lib/R/bin/exec/R --no-echo --no-restore --file=api.R
ajaimes    67653   30403  0 16:26 pts/3    00:00:00 grep --color=auto bin/exec/R

As you can see the processes 67459 and 67486 are still alive and the PPID has changed from 67325 to 1955.

Looking into your code I found the issue is hereThe PPID is killed first and when this happens another (and different) PPID is assigned to the orphan processes

I tried just changing the order, first killing the childs and after that kill the service but I don't know why it doesn't work properly. Usually it works just the first kill and restart but after that it starts to fail letting orphans the processes.

kill = function(signal = 15L) {
      # kill childs
      system(sprintf("pkill -%s -P %s", signal, self$pid), wait = FALSE)
      # kill service
      tools::pskill(self$pid, signal)
    }
  )

After trying some ideas what I found works properly is (It is not the most elegant):

kill = function(signal = 15L) {
      # get childs
      child_pids <- suppressWarnings(system(sprintf("pgrep -P %s", self$pid), intern = TRUE))
      # kill all
      tools::pskill(c(self$pid, child_pids), signal)
    }

The ugliest part is the suppress warnings function but it's needed because system with intern = TRUE returns a warning if it doesn't find any child process.

Any comment please let me know,

And again thanks for all your contributions,

Abraham

sessionInfo() R version 4.2.2 (2022-10-31) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.6 LTS

Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale: [1] LC_CTYPE=es_MX.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_MX.UTF-8 LC_COLLATE=es_MX.UTF-8
[5] LC_MONETARY=es_MX.UTF-8 LC_MESSAGES=es_MX.UTF-8
[7] LC_PAPER=es_MX.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_MX.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods
[7] base

other attached packages: [1] RestRserve_1.2.2

loaded via a namespace (and not attached): [1] compiler_4.2.2 backports_1.4.1 R6_2.5.1 parallel_4.2.2 [5] tools_4.2.2 Rcpp_1.0.12 uuid_1.2-0 Rserve_1.8-13
[9] checkmate_2.3.1 jsonlite_1.8.8 digest_0.6.35 mime_0.12

dselivanov commented 3 months ago

fixed in #210