rexyai / RestRserve

R web API framework for building high-performance microservices and app backends
https://restrserve.org
271 stars 31 forks source link

Not all errors are caught when processing requests #207

Closed hafen closed 4 months ago

hafen commented 4 months ago

Some errors generated when using packages such as those in the tidyverse are not being caught and are getting passed through as status 200 with an empty body when they should actually be errors.

Consider the following example.

Suppose we want take a dataset and a grouping variable as inputs and do a computation like the following:

ds <- "iris"
by_var <- "Species"

res <- get(ds, .GlobalEnv) |>
  dplyr::group_by(.data[[by_var]]) |>
  dplyr::summarise_all(mean)

Suppose a user provides an input for a non-existent dataset:

# scenario 1: base error
ds <- "a" # dataset that doesn't exist
by_var <- "Species"

res <- get(ds, .GlobalEnv) |>
  dplyr::group_by(.data[[by_var]]) |>
  dplyr::summarise_all(mean)
# Error in get(ds, .GlobalEnv) : object 'a' not found

Now suppose the user provides an input for a grouping variable that doesn't exist:

# scenario 2: rlang error:
ds <- "iris"
by_var <- "a" # variable that doesn't exist

res <- get(ds, .GlobalEnv) |>
  dplyr::group_by(.data[[by_var]]) |>
  dplyr::summarise_all(mean)
# Error in `dplyr::group_by()`:
# ! Must group by variables found in `.data`.
# ✖ Column `a` is not found.
# Run `rlang::last_trace()` to see where the error occurred.

Now, if we put this in RestRserve:

library(RestRserve)

app <- Application$new()

app$add_post(path = "/groupMeans", function(.req, .res) {
  ds <- .req$body$ds
  by_var <- .req$body$by_var

  res <- get(ds, .GlobalEnv) |>
    dplyr::group_by(.data[[by_var]]) |>
    dplyr::summarise_all(mean)
  .res$set_body(res)
})

request1 <- RestRserve::Request$new(
  method = "POST",
  content_type = "application/json",
  path = "/groupMeans",
  body = to_json(list(ds = "a", by_var = "Species"))
)
resp1 <- app$process_request(request1)
# {"timestamp":"2024-03-05 15:47:44.797372","level":"ERROR","name":"Application","pid":49811,"msg":"","context":{"request_id":"c36d0b3a-db4a-11ee-acec-12ba65ed27f2","message":{"error":"object 'a' not found","call":"FUN(request, response)","traceback":["FUN(request, response)",["res <- a |>","    dplyr::group_by(.data$Species) |>","    dplyr::summarise_all(mean)"],"manip_all(.tbl, .funs, enquo(.funs), caller_env(), ..., .caller = \"summarise_all\")","syms(tbl_nongroup_vars(.tbl))","map(x, sym)","lapply(.x, .f, ...)","tbl_nongroup_vars(.tbl)","setdiff(tbl_vars(x), group_vars(x))","tbl_vars(x)","new_sel_vars(tbl_vars_dispatch(x), group_vars(x))",["structure(vars, groups = group_vars, class = c(\"dplyr_sel_vars\", ","    \"character\"))"],"tbl_vars_dispatch(x)",["res <- a |>","    dplyr::group_by(.data$Species) |>","    dplyr::summarise_all(mean)"]]}}}

request2 <- RestRserve::Request$new(
  method = "POST",
  content_type = "application/json",
  path = "/groupMeans",
  body = to_json(list(ds = "iris", by_var = "a"))
)
resp2 <- app$process_request(request2)
resp2$status
# [1] "200 OK"
resp2$body
# [1] ""

In the latter case (request2), I expect to see an error, not a "200 OK" with an empty body, as there is definitely an error. These types of errors are caught using try() so I'm not sure why they aren't caught in server requests.

Session info:

r$> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RestRserve_1.2.1 devtools_2.4.5   usethis_2.2.2    pillar_1.9.0    

loaded via a namespace (and not attached):
 [1] utf8_1.2.4        generics_0.1.3    stringi_1.8.3     digest_0.6.34     magrittr_2.0.3    pkgload_1.3.4    
 [7] fastmap_1.1.1     jsonlite_1.8.8    pkgbuild_1.4.3    sessioninfo_1.2.2 backports_1.4.1   urlchecker_1.0.1 
[13] promises_1.2.1    purrr_1.0.2       fansi_1.0.6       cli_3.6.2         shiny_1.8.0       rlang_1.1.3      
[19] ellipsis_0.3.2    remotes_2.4.2.1   cachem_1.0.8      tools_4.3.2       parallel_4.3.2    uuid_1.2-0       
[25] memoise_2.0.1     checkmate_2.3.1   dplyr_1.1.4       httpuv_1.6.14     vctrs_0.6.5       R6_2.5.1         
[31] mime_0.12         lifecycle_1.0.4   stringr_1.5.1     fs_1.6.3          htmlwidgets_1.6.4 miniUI_0.1.1.1   
[37] pkgconfig_2.0.3   later_1.3.2       glue_1.7.0        profvis_0.3.8     Rcpp_1.0.12       Rserve_1.8-13    
[43] tidyselect_1.2.0  tibble_3.2.1      xtable_1.8-4      htmltools_0.5.7   compiler_4.3.2 
hafen commented 4 months ago

I think the issue is here:

https://github.com/rexyai/RestRserve/blob/2f84b697d1d0827340234b25167e75f519b1a013/R/Application.R#L550

It's checking

inherits(x, "simpleError")

But if you look at base errors vs. errors that come from rlang:

try_capture_stack <- RestRserve:::try_capture_stack
class(try_capture_stack(stop("test")))
# [1] "simpleError" "error"       "condition"
class(try_capture_stack(rlang::abort("test")))
# [1] "rlang_error" "error"       "condition"

Is there any reason that we couldn't change that line to be either:

inherits(x, "error")

or

inherits(x, c("simpleError", "rlang_error"))
dselivanov commented 4 months ago

I think inherits(x, "error") should be good enough. Not sure why rlang decided to omit simpleError class. Thanks for investigation! Would you be able a bmot a PR?