Closed GreatEmerald closed 4 years ago
Git blame points to @appelmar for this.
I checked and the same would be achieved by just keeping .bfastpp.full()
(i.e. the original function bfastpp
) and running model.matrix()
on the result to get the matrix version of it, according to the formula. This would also fix the seasonal dummies. Is there any particular reason why it is not done that way at the moment?
As far as I remember, the reason for separate functions was just to touch as little as possible from the old (non-optimized) implementation. I fully agree that a unified implementation with less redundant code should be preferred. I think avoiding the factor conversion was simply a minor performance optimization. HTH!
OK, good to know! I need the seasonal dummies to work properly, since another feature depends on it. So I'll make a PR, than ask you for a review.
I see, that was indeed a deliberate optimisation that saves 1 ms. But the problem is that the solution is not featureful enough, as it doesn't expand dummies and also it doesn't do regular parsing of the formula, so e.g. response~I(trend^2)+harmon
doesn't work altogether (trend gets omitted). If this is removed, the rest of the bfastmonitor optimisations are still working fine, so we don't lose so much performance:
Unit: milliseconds
expr min lq mean median uq max neval
bfast:::.bfastmonitor.default(NDVIa, start = c(2010, 13)) 14.142997 16.040745 17.71687 16.544131 16.99268 246.62345 1000
.bfastmonitor.matrix(NDVIa, start = c(2010, 13)) 9.388053 10.843289 11.66023 11.193384 11.50773 34.69414 1000
bfast:::.bfastmonitor.matrix(NDVIa, start = c(2010, 13)) 8.314812 9.590819 10.42102 9.878581 10.13878 35.34907 1000
Middle is the version I'm proposing. I think that's fine for now, a hit of 1 ms is unfortunate but worth it for the correct interpretation of the formula.
This is fixed in #52.
A lot of code is the same between
.bfastpp.full()
and.bfastpp.formula()
, but unfortunately not everything. That leads to inconsistent behaviour, e.g.season
is calculated asfactor(cycle(y))
in the former and as1 + seq(as.numeric(cycle(y[1]))-1
in the latter, which leads to incorrect type (numeric instead of the expected factor).Following DRY principles, the code should be unified.
As an aside but related: bfastpp gives a radically different object when a formula is given, vs when only a
ts
is given. It's extremely confusing, as the user normally assumes that giving a formula would simply filter out the undesired columns from the result.