JBGruber opened 5 months ago
This works now in the output branch. I opted for something between naive and advanced: when you supply a vector of servers, you can name each element with the share of requests that server should handle. So `c("0.6" = "http://localhost:11434/", "0.4" = "http://192.168.2.45:11434/")`
will hand 60% of requests to localhost and 40% to the remote machine. It's pretty quick:
```r
library(rollama)
library(tidyverse)

reviews_df <- read_csv(
  "https://raw.githubusercontent.com/AFAgarap/ecommerce-reviews-analysis/master/Womens%20Clothing%20E-Commerce%20Reviews.csv",
  show_col_types = FALSE
) |>
  sample_n(500)
#> New names:
#> • `` -> `...1`

make_query <- function(t) {
  tribble(
    ~role,    ~content,
    "system", "You assign texts into categories. Answer with just the correct category, which is either {positive}, {neutral} or {negative}.",
    "user",   t
  )
}

start <- Sys.time()
reviews_df_annotated <- reviews_df |>
  mutate(query = map(`Review Text`, make_query),
         category = query(query, screen = FALSE,
                          model = "llama3.2:3b-instruct-q8_0",
                          server = c("0.6" = "http://localhost:11434/",
                                     "0.4" = "http://192.168.2.45:11434/"),
                          output = "text"))
stop <- Sys.time()
stop - start
#> Time difference of 18.19546 secs
```
Created on 2024-10-18 with reprex v2.1.0
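For readers curious how named shares could translate into an assignment of requests to servers, here is a hypothetical sketch (this is an illustration, not the actual rollama internals): weighted sampling with the names parsed as probabilities.

```r
# Hypothetical sketch: split requests across servers according to the
# shares encoded in the vector names. NOT the actual rollama
# implementation, just one simple way it could work.
servers <- c("0.6" = "http://localhost:11434/",
             "0.4" = "http://192.168.2.45:11434/")

set.seed(42)
# Draw a server for each of 1000 requests, weighted by the named shares.
assignment <- sample(servers, 1000, replace = TRUE,
                     prob = as.numeric(names(servers)))

# Observed split should be roughly 0.6 / 0.4.
prop.table(table(assignment))
```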
The same approach implemented in #16 could also be used to send requests to multiple Ollama servers at once and process them in parallel. There are at least two approaches we could follow:

1. Split the requests among the servers up front (according to the supplied shares) and let each instance work through its batch.
2. Keep a single queue and hand each request to whichever server becomes free next.

With 1., the total run time would be determined by the slowest instance. 2. would be much more efficient in scenarios with a mix of fast and slow machines, but also harder to implement.
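A toy simulation (made-up per-request times, no real servers) illustrates why dynamic dispatch wins with uneven machines: each request goes to whichever server is free soonest, so the fast machine naturally absorbs more of the queue.

```r
# Toy simulation of queue-based dispatch (approach 2). Server "speeds"
# are invented numbers, not measurements.
servers <- c(fast = 1, slow = 3)  # seconds per request (assumed)
queue   <- paste0("request_", 1:6)

# Track when each server next becomes free.
busy_until <- setNames(numeric(length(servers)), names(servers))
log <- character(0)

for (req in queue) {
  s <- names(which.min(busy_until))       # next free server gets the job
  busy_until[s] <- busy_until[s] + servers[s]
  log <- c(log, sprintf("%s -> %s", req, s))
}

max(busy_until)  # total run time: 6 (fast takes 4 requests, slow takes 2)
# A static 50/50 split would instead finish after 3 * 3 = 9 seconds,
# bounded by the slow machine.
```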