When creating several lags of a given variable at once, I've found that a `map` + `partial` structure is around 7 times faster on big datasets with multiple lags (I tested with 16M rows and 10 lags). It could be worth checking out.
For your reference, this is the function I built:
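library(dplyr)
library(purrr)

calculate_lags <- function(df, var, lags) {
  # Build one partially applied lag() per requested offset
  map_lag <- lags %>% map(~partial(lag, n = .x))
  # Apply every lag function to the selected column in a single mutate()
  return(df %>% mutate(across(.cols = {{var}}, .fns = map_lag, .names = "{.col}_lag{lags}")))
}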
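In case it helps, here is a small usage sketch on toy data. Note that this is a slight variation on the function above, not the exact same code: the helper name `calculate_lags_named`, the data frame `df_toy`, and the column `x` are made up for illustration. I also name the list of lag functions and use `{.fn}` in `.names`, and I spell out `dplyr::lag` so that `stats::lag` is not picked up by accident. Treat it as a sketch under those assumptions.

```r
library(dplyr)
library(purrr)

# Hypothetical variant of the function above: the lag functions are named
# after their offsets, so `.names` can rely on `{.fn}`.
calculate_lags_named <- function(df, var, lags) {
  # One partially applied dplyr::lag() per offset
  lag_fns <- map(lags, ~partial(dplyr::lag, n = .x))
  # Name each function after its offset ("1", "2", ...)
  lag_fns <- setNames(lag_fns, lags)
  df %>%
    mutate(across(.cols = {{ var }}, .fns = lag_fns, .names = "{.col}_lag{.fn}"))
}

# Toy data, made up for illustration
df_toy <- tibble(id = 1:5, x = c(10, 20, 30, 40, 50))

result <- calculate_lags_named(df_toy, x, 1:2)
print(result)
# New columns:
#   x_lag1: NA 10 20 30 40
#   x_lag2: NA NA 10 20 30
```

Naming the functions should keep the column names lined up even if `var` selects more than one column, since `{.col}` and `{.fn}` are expanded together by `across()`.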