BergelsonLab / blabr

rewrite fixations_to_timepoints (fka binifyFixations) using join_by #44

Open kalenkovich opened 2 months ago

kalenkovich commented 2 months ago

Note: fixations_to_timepoints isn't yet implemented at all.

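library(dplyr)  # join_by() requires dplyr >= 1.1.0
library(tidyr)  # unnest()

# Assumes `fixations` (with current_fix_start / current_fix_end columns)
# and `bin_size` are already defined.

# All bin start times spanning the earliest to the latest fixation
t_series <- fixations %>%
  summarise(t_min = min(current_fix_start),
            t_max = max(current_fix_end)) %>%
  mutate(across(c(t_min, t_max),
                ~ floor(.x / bin_size) * bin_size)) %>%
  mutate(t = list(seq(t_min, t_max, by = bin_size))) %>%
  select(t) %>%
  unnest(cols = t)

# Match each bin to every fixation whose bin-aligned interval contains it
t_series %>%
  inner_join(
    fixations %>%
      mutate(across(c(current_fix_start, current_fix_end),
                    ~ floor(.x / bin_size) * bin_size)),
    by = join_by(between(t, current_fix_start, current_fix_end))
  )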
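
To make the behavior of the join easier to check by hand, here is a small self-contained sketch on made-up data. The `interest_area` column, the two fixations, and `bin_size <- 20` are assumptions for illustration only and are not taken from blabr or any real dataset. The time series is also built directly with `seq()` instead of `summarise()`/`unnest()`, which should give the same bins here.

```r
library(dplyr)  # join_by() requires dplyr >= 1.1.0

bin_size <- 20  # assumed bin width, in ms

# Hypothetical fixation report: two fixations from a single trial
fixations <- tibble(
  current_fix_start = c(5, 70),
  current_fix_end   = c(48, 95),
  interest_area     = c("target", "distractor")  # illustrative column
)

# Snap fixation boundaries down to the start of the bin they fall in
binned <- fixations %>%
  mutate(across(c(current_fix_start, current_fix_end),
                ~ floor(.x / bin_size) * bin_size))

# One row per bin, from the earliest to the latest bin-aligned time
t_series <- tibble(t = seq(min(binned$current_fix_start),
                           max(binned$current_fix_end),
                           by = bin_size))

# between() is inclusive on both ends, so a bin matches every fixation
# whose bin-aligned [start, end] interval contains it
timepoints <- t_series %>%
  inner_join(binned,
             by = join_by(between(t, current_fix_start, current_fix_end)))

print(timepoints)
```

Two things this toy case suggests, both of which would need checking against the real function: because both bounds are inclusive, a bin in which one fixation ends and the next begins would match both fixations, and a dataset with several trials or recordings would presumably need an equality condition on the trial identifier in `join_by()` as well.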

Update June 7 2024

The main part (speeding up by switching to a join) was done in a632aa0f.

kalenkovich commented 2 months ago

In informal testing on ht_seedlings (possibly the largest dataset we have), the join version takes less than 1 second, while the original takes around a minute.

kalenkovich commented 2 months ago

Not about the optimization but just to have this checklist somewhere: