Open dajmcdon opened 8 months ago
That mean median comparison though.
When I search for gcd_num
, the only time it shows up is in guess_period
, where it is handed a list of diffs between dates, converted to numeric. So I'd expect it to run once per loop of epix_process
, which doesn't seem likely to eat too much time.
The lack of tests may mean there's a bug lurking, so that would be a potential gain from switching over to a non-DIY implementation.
Tempting speedups and testedness, but Rcpp also means potential extra installation hassle for some users and potential license headaches (assuming we don't already transitively depend on Rcpp). Without evidence that this impacts epix_slide()
performance significantly or of a bug, I'm not sure whether we should proceed.
We may not even want to have a GCD algorithm at all. It seems like if the guessed period doesn't appear in the diffs of the unique times, then we have some really weird/wrong data. And if we make that an error, then we can move from calculating the GCD to taking the smallest diff and simply checking that it is a CD.
Actually, further, we might not need a GCD at all. I'm not sure why I used this for epix_slide
default ref_time_values
(i.e. versions).
And we're not using GCD in guess_time_type()
either.
[But we might actually be missing a guess_period()
in next_after()
impls... though that mainly only messes up.... guess_period()
later, and trying to guess the period to create new times and being wrong may go south when using time types that don't enforce the period.]
assuming we don't already transitively depend on Rcpp
We do. I wonder if tsibble is pushed into GPL because of Rcpp.
r$> pak::pkg_deps_explain("cmu-delphi/epiprocess", "Rcpp")
epiprocess -> tsibble -> anytime -> Rcpp
(Side note: we could reduce runtime by 60% by using base R for checks instead of checkmate, and 70%+ by omitting some checks if possible. Still not sure we even want gcd at all anymore though. But this brings to attention that we should take care about validation in inner loops.)
I'm not entirely sure what the use case for
gcd_num()
is, but is it worth implementing differently? We don't currently import Rcpp (which the second version uses):Created on 2024-03-06 with reprex v2.1.0