tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/
89 stars 31 forks source link

Improve memory usage of reconciled forecasting #160

Closed mitchelloharawild closed 4 years ago

mitchelloharawild commented 4 years ago

Ref: https://stackoverflow.com/questions/60376315/reconciliation-in-fable-runs-out-of-ram

mitchelloharawild commented 4 years ago

Seems to be from build_smat(), which is what I had expected.

mitchelloharawild commented 4 years ago

The major issue causing the memory issues here is that in the current algorithm the hierarchical structure is first assumed grouped (giving a fully grouped smat), and then unused structure is removed (giving the desired structure). For deeply nested hierarchies like shown on SO, computing the grouped structure leads to very large matrices.

As an aside, it would also be nice to support sparse matrices in this algorithm.

mitchelloharawild commented 4 years ago

I've written an experimental approach for computing the arbitrary summation matrix.

The old approach uses column-matrix products, which can be slow for larger hierarchies. Essentially this old method built summation matrices for each key, and then combines (as groups) all keys using matrix operations across columns.

The new candidate approach in the fabletools@smat branch combines layers of aggregation by rows, avoiding the need for matrix multiplication. By doing this, unused combinations are never computed and so grouped structures don't require discarding to give nested hierarchies. Some minimal testing suggests these two approaches are equivalent, however more checks are required before moving to this new algorithm. It seems to work slower for simpler hierarchies (no optimisation done yet, this should be better later), but much much faster for deeply nested hierarchies.

mitchelloharawild commented 4 years ago

Note: update SO post when available on CRAN.

mitchelloharawild commented 4 years ago

Merged into master and now on CRAN.