
posts/2023-08-22-valve-for-production/2023-08-22-valve-for-production #7

Open utterances-bot opened 10 months ago

utterances-bot commented 10 months ago

Josiah Parry - Valve: putting R in production

http://josiahparry.com/posts/2023-08-22-valve-for-production/2023-08-22-valve-for-production

joscani commented 10 months ago

Great post. I'll try to reproduce a complete example and run some tests.

JosiahParry commented 10 months ago

@joscani please do test it! I would like to see how it works for you :)

joscani commented 10 months ago

Very interesting. Here is some of my code for predicting with a brms model served by plumber:

# in bash

# valve -f plumber.R -n 5 -w 5 --n-min 3

library(tidyverse)
library(furrr)

# send requests from 5 parallel R sessions
plan(multisession, workers = 5)
options(future.rng.onMisuse = "ignore") # for RcppSimdJson issues

test <- read_csv(here::here("data/test_local.csv"))

test

# POST a data frame as JSON to the /predict endpoint on the given port
# and parse the predictions out of the response
predict_with_valve <- function(port, test) {

    base_url <- "http://127.0.0.1:"
    test_json <- jsonify::to_json(test)

    api_res <- httr::POST(url = paste0(base_url, port, "/predict"),
                          body = test_json,
                          encode = "raw")
    predicted_values <- httr::content(api_res, as = "text", encoding = "UTF-8")
    # RcppSimdJson is faster than jsonlite for parsing
    RcppSimdJson::fparse(predicted_values)
}

# stack 10 copies of the test set to build a larger payload
n <- 10
big_test <- do.call("rbind", replicate(n, test, simplify = FALSE))

big_test
# 6560 rows, 5 columns

# 5 concurrent requests against valve (port 3000, load-balanced plumber workers)
start <- Sys.time()
multi <- future_map(1:5, ~ predict_with_valve(3000, big_test))
multi_total <- Sys.time() - start

rm(multi)
gc()

# 5 concurrent requests against a single plumber instance (port 6140)
start <- Sys.time()
single <- future_map(1:5, ~ predict_with_valve(6140, big_test))
single_total <- Sys.time() - start

rm(single)
gc()

multi_total
single_total

The results:

> multi_total
Time difference of 44.06464 secs
> single_total
Time difference of 1.53961 mins
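
For context, the plumber.R behind these calls might look something like the sketch below; the model object, its file path, and the parsing details are assumptions for illustration, not the actual code.

# plumber.R -- a minimal sketch of the /predict endpoint being benchmarked.
# The model path and object are hypothetical.
model <- readRDS("model/brms_fit.rds")

#* @post /predict
function(req) {
  # parse the JSON body that predict_with_valve() POSTs
  newdata <- RcppSimdJson::fparse(req$postBody)
  predict(model, newdata = newdata)
}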
joscani commented 10 months ago

What do you think about async with future in plumber?
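
For reference, an async plumber endpoint using promises and future might look like the sketch below; the model object and worker count are assumptions, not tested code.

# plumber.R -- sketch of an async /predict endpoint.
# future_promise() runs the prediction on a background worker so the
# main plumber process stays free to accept new requests.
library(promises)
library(future)
plan(multisession, workers = 4)

model <- readRDS("model/brms_fit.rds") # hypothetical model object

#* @post /predict
function(req) {
  body <- req$postBody
  future_promise({
    newdata <- RcppSimdJson::fparse(body)
    predict(model, newdata = newdata)
  })
}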

JosiahParry commented 10 months ago

@joscani that looks like a great performance enhancement! I have run into similar bottlenecks. I think that as the amount of data transferred grows, the limiting factor becomes I/O, not computation time.

I haven't used async with future inside of plumber. I think it could be really good. I do know that @mikemahoney218 has mentioned getting huge improvements using memoise to cache responses. That could be a good first thing to look at.
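
A memoise-based cache along those lines might look like this sketch; slow_predict and the model object are hypothetical names, not an API from the post.

library(memoise)

model <- readRDS("model/brms_fit.rds") # hypothetical model object

slow_predict <- function(newdata) {
  predict(model, newdata = newdata)
}

# wrap the expensive call: identical inputs hit the in-memory cache
# instead of re-running the model
cached_predict <- memoise(slow_predict)

Repeated requests with the same payload would then return almost instantly, at the cost of keeping cached results in memory.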