JosiahParry / pathattr

Making R code 200-300x faster using Rust

Possible improvement for R #1

Open TimTaylor opened 1 year ago

TimTaylor commented 1 year ago

Not on mastodon but saw your toot and think the following approach (assuming it's correct) is likely quicker:

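library(data.table)

# Channel weights, keyed by channel name
lu <- c(fb = 0.2, tiktok = 0.1, gda = 0.3, yt = 0.10, gs = 0.60, rtl = 0.05, blog = 0.09)

# Split every path once; lt is the number of touches per path
touches <- strsplit(path_data$path, ">", fixed = TRUE)
lt <- lengths(touches)

# Repeat per-path values so they line up with the flattened touches
groups <- rep.int(seq_along(touches), lt)
outcome <- rep.int(path_data$leads, lt)
value <- rep.int(path_data$value, lt)
touches <- unlist(touches)
dates <- unlist(strsplit(path_data$dates, ">", fixed = TRUE))

# Drop empty touches from every vector so all columns keep the same length
# (assumes dates splits into the same positions as path)
not_empty <- touches != ""
groups <- groups[not_empty]
outcome <- outcome[not_empty]
value <- value[not_empty]
dates <- dates[not_empty]
touches <- touches[not_empty]

re <- unname(lu[touches])
DT <- data.table(
    channel_name = touches,
    outcome = outcome,
    date = dates,
    re = re,
    value = value,
    groups = groups
)

# Per-path weight total, then each touch's share of leads and value
DT[, re_tot := sum(re, na.rm = TRUE), by = groups]
DT[, `:=`(conversion = outcome * re / re_tot, value = value * re / re_tot)]
DT[, .(channel_name, re, conversion, value, date)]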

Apologies for the noise if I've made a mistake

JosiahParry commented 1 year ago

holy smokes batman!

JosiahParry commented 1 year ago

This is a great example of how the math functionality in base R is insanely fast, and often it's just a matter of making the problem vectorized.
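A minimal base R sketch of that point, not code from this thread: the toy `path_data`, the helper names `attribute_loop` and `attribute_vec`, and the use of `ave()` for the grouped sum are all assumptions made for illustration. It compares a row-by-row loop with a flattened, vectorized version of the same proportional split, and assumes every channel appears in the weight lookup.

```r
# Hypothetical toy data; column names mirror those used earlier in the thread.
path_data <- data.frame(
  path  = c("fb>gs", "yt>>blog>gda"),
  leads = c(1, 2),
  stringsAsFactors = FALSE
)
lu <- c(fb = 0.2, tiktok = 0.1, gda = 0.3, yt = 0.10, gs = 0.60, rtl = 0.05, blog = 0.09)

# Row-by-row version: split one path at a time and share its leads
# across channels in proportion to the lookup weights.
attribute_loop <- function(paths, leads, weights) {
  out <- vector("list", length(paths))
  for (i in seq_along(paths)) {
    ch <- strsplit(paths[i], ">", fixed = TRUE)[[1]]
    ch <- ch[ch != ""]                 # skip empty touches such as "yt>>blog"
    w <- weights[ch]
    out[[i]] <- leads[i] * w / sum(w)
  }
  unname(unlist(out))
}

# Vectorized version: flatten every path once, then use a grouped sum.
attribute_vec <- function(paths, leads, weights) {
  touches <- strsplit(paths, ">", fixed = TRUE)
  lt <- lengths(touches)
  groups <- rep.int(seq_along(touches), lt)   # path index for each touch
  touches <- unlist(touches)
  keep <- touches != ""
  groups <- groups[keep]
  w <- unname(weights[touches[keep]])
  w_tot <- ave(w, groups, FUN = sum)          # per-path total, repeated per touch
  leads[groups] * w / w_tot
}

loop_res <- attribute_loop(path_data$path, path_data$leads, lu)
vec_res  <- attribute_vec(path_data$path, path_data$leads, lu)

# Both approaches agree; the first path splits its single lead 0.25 / 0.75.
stopifnot(isTRUE(all.equal(loop_res, vec_res)))
```

The vectorized version does the string splitting and arithmetic once over the whole input rather than once per row, which is where the speedup described above would be expected to come from on larger data.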

TimTaylor commented 1 year ago

Aye. The Rust code is super nice and readable though, so still swings and roundabouts!