wsdewitt closed this 3 years ago
I dunno, I'd leave this out for now, unless you really really want it. Trend filtering is an l1 penalty on second differences, leading it to prefer piecewise linear trajectories rather than piecewise constant. If you think that's a better prior, then maybe so. But it does become a zoo of various penalties to work with, so you can go down the rabbit hole of different penalty functions for a long time. I like the ones we've used here for our first work.
I think there's got to be some way of computing the prox of the trend filtering penalty fairly efficiently.
Yeah, this is marked as an enhancement, not critically needed. I think it'll behave better with histories that have varying smoothness, and it requires one less param (well, another param is the order of the finite difference, but that's just an integer).
Trend filtering may be a more flexible and general way to regularize on finite differences (smoothness in time).
The current implementation does total variation (TV) / fused lasso regularization, which is a 0-th order trend filter (1st order finite difference operator). We tack on a spline term to get more smoothness.
A better approach is probably to replace these two terms with a single trend filter term (i.e. an l1 penalty on second-order finite differences). Trend filters are provably better than smoothing splines at adapting to local smoothness: https://projecteuclid.org/euclid.aos/1395234979
The hard part is that we don't have a proximal operator for general trend filters, as we do for the TV penalty with the prox-tv python package. Existing trend filter code all assumes least-squares loss.
An apparently similar method, known as total generalized variation (TGV), exists in the signal processing literature: https://epubs.siam.org/doi/abs/10.1137/090769521