Banana1530 opened 6 years ago
As discussed in the review of PR #11, our current UFL implementation is somewhat wasteful memory-wise and could likely be sped up by using "vector"-style calculations rather than matrix multiplies.
Just to be clear, ADMM and AMA do not actually involve matrix multiplication with those triangular matrices. I would suggest we first transform the user-supplied weight matrix into a vector on the R side.
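As a rough sketch of what that conversion could look like (the helper name and the assumption that only adjacent-pair weights matter are mine, not from the codebase; the real version would live in the R wrapper):

```python
def weights_to_vector(W):
    """Extract the adjacent-pair weights W[i][i+1] from a full weight
    matrix into a flat vector, so the solver only ever touches a length
    n-1 array instead of multiplying by a triangular matrix.
    Hypothetical helper for illustration only."""
    n = len(W)
    return [W[i][i + 1] for i in range(n - 1)]

# e.g. a 3x3 symmetric weight matrix reduces to the two off-diagonal weights
W = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
print(weights_to_vector(W))  # -> [1, 2]
```

The point is that the solver then iterates over this vector directly, with O(n) storage instead of O(n^2).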
Comparison of the two algorithms: judging from theoretical time complexity, DP is faster as the dimension grows (it is O(n)). But given the nature of the path algorithm, I observe that when lambda is relatively small (few knots are encountered before the algorithm terminates), the path algorithm has the advantage for lengths < 1000.
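If we wanted to dispatch automatically between the two, the observation above could be encoded as a simple heuristic. Everything here is hypothetical: the function name, the lambda threshold, and the cutoff of 1000 (the latter is the empirical figure quoted above, not a theoretical result):

```python
def choose_solver(n, lam, lam_small=0.1, path_cutoff=1000):
    """Hypothetical dispatch rule mirroring the benchmark observation:
    the path algorithm wins for short signals with small lambda (few
    knots before termination), while DP's O(n) cost wins as n grows.
    The thresholds are illustrative stand-ins, not tuned values."""
    if n < path_cutoff and lam < lam_small:
        return "path"
    return "dp"

print(choose_solver(500, 0.01))   # -> path
print(choose_solver(5000, 0.01))  # -> dp
```

In practice the thresholds would need to come from benchmarks on the actual implementations.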
Improvement of the heap implementation.
In the sift-up and sift-down operations I use a simple swap function. The extra copying this causes can be avoided. For example, in a chain a -> b -> c -> d, to sift a to the rightmost position we can first store a somewhere else, move b, c, and d one slot to the left sequentially, and then plug a into the slot where d was. What the code currently does is swap a/b, then a/c, then a/d, which writes every element twice.
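The "hole" technique described above can be sketched as follows for a min-heap sift-down (Python for illustration; the actual implementation is not this code):

```python
def sift_down(heap, i):
    """Min-heap sift-down using the 'hole' technique: instead of
    swapping the displaced element at every level, hold it aside once,
    shift the smaller child up into the hole at each level, and write
    the held element back a single time at the end."""
    n = len(heap)
    item = heap[i]                 # store "a" somewhere else, once
    while True:
        child = 2 * i + 1
        if child >= n:
            break
        # pick the smaller of the two children
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1
        if heap[child] >= item:
            break
        heap[i] = heap[child]      # move the child up into the hole
        i = child
    heap[i] = item                 # plug "a" into its final slot

h = [5, 1, 2, 4, 3]
sift_down(h, 0)
print(h)  # -> [1, 3, 2, 4, 5]
```

Each level now costs one copy instead of the three moves a swap performs, which is the saving described above.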
For special weight matrices, we can provide even more efficient algorithms than the general-purpose ADMM or AMA methods.