This PR significantly reduces the wall time and memory utilisation of bottom-up reconciliation for both dense and sparse "summing" matrices, $S$. The functionality provided by this package is helpful, but I found the reconcilers impractical to use for the large hierarchical and grouped structures I encounter in my job, so I plan to open a few PRs that improve performance and functionality for this use case where possible.
Note: $P$ refers to the matrix that maps the base forecasts to the bottom level, such that $\tilde{y}_{h} = SP\hat{y}_{h}$. I was originally a bit confused by the notation in this package: since these are linear methods, $P$ would usually denote the projection matrix, which contains the leverages and maps the base forecasts onto the coherent subspace, and here that matrix is actually $SP$. I usually see the mapping matrix referred to as $G$ in the literature, but I know Hyndman is doing some work on reconciliation notation, so perhaps this is something to be standardised in the future.
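As a toy example (mine, not from the package) of the bottom-up case with one aggregate and two bottom-level series, i.e. $n_a = 1$ and $n_b = 2$:

$$
S = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
SP = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
$$

so $P$ picks out the bottom-level base forecasts, $PS = I_{n_{b}}$, and $SP$ is the idempotent projection that maps $\hat{y}_{h}$ onto the coherent subspace.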
This improvement is achieved through the following modifications to the `_get_PW_matrices` method for `BottomUp` and `BottomUpSparse`:
- Removing the slicing with `idx_bottom` used to get the elements of $S$ and set the elements of $P$, and instead instantiating and allocating $P$ more efficiently via the upper-diagonal (offset) functionality in `np.eye` and `scipy.sparse.eye`, setting $P$ directly as the concatenation of a null matrix, $0_{n_{b}, \, n_{a}}$, and an identity matrix, $I_{n_{b}}$ (see the sketch after this list). This also has the added benefit of avoiding an albeit minimally expensive transpose, as we no longer need to change the shape and strides of $P$, but it adds the constraint that the indices corresponding to the bottom-level nodes in the hierarchical or grouped structure must be ordered such that $S$ equals $\begin{bsmallmatrix} A \\ I_{n_{b}} \end{bsmallmatrix}$, which is not an issue as this is the standard formulation anyway.
- As the "weight" matrix, $W$, is not needed for point reconciliation with single-level approaches and is only required for probabilistic reconciliation in this context, $W$ is no longer instantiated and allocated unless an `intervals_method` is passed to the `fit` method, which reduces the wall time and memory utilisation of point reconciliation.
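A minimal sketch of the construction (my own illustration with stand-in sizes, not the package's exact code; the `intervals_method` check at the end is just a schematic of the lazy $W$ behaviour):

```python
import numpy as np
import scipy.sparse as sparse

# Stand-in dimensions: n_a aggregate series, n_b bottom-level series.
n_a, n_b = 3, 4
n = n_a + n_b

# P = [0_{n_b x n_a} | I_{n_b}] falls straight out of the diagonal offset
# k = n_a, with no slicing by idx_bottom and no transpose.
P_dense = np.eye(n_b, n, k=n_a)                      # BottomUp
P_sparse = sparse.eye(n_b, n, k=n_a, format="csr")   # BottomUpSparse

# Sanity check against a standard S = [A; I_{n_b}] with an arbitrary A:
# P S = I_{n_b}, so S P y_hat is exactly the bottom-up reconciliation.
A = np.random.default_rng(0).random((n_a, n_b))
S = np.vstack([A, np.eye(n_b)])
assert np.allclose(P_dense @ S, np.eye(n_b))
assert np.allclose(P_sparse @ S, np.eye(n_b))

# W (identity weights here, purely as a placeholder) is only materialised
# when an intervals_method is requested.
intervals_method = None
W = np.eye(n) if intervals_method is not None else None
```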
These changes effectively deprecate the `idx_bottom` parameter for `BottomUp` and `BottomUpSparse`; although it is used in the following situations, removing it does not change the output assuming $S$ is standard:
- A slice of $S$ and $\hat{y}_{h}$ is used for the bottom-up reconciliation from the anchor level in middle-out reconciliation using `MiddleOut`. The slice of $S$ is currently corrected to account for the new bottom level using `np.unique`, but now requires an additional left-right flip to keep $S$ equal to $\begin{bsmallmatrix} A \\ I_{n_{b}} \end{bsmallmatrix}$, as `np.unique` does not preserve the order. This operation is extremely fast and takes constant time as it returns a view of $S$, so there is no performance degradation (see the illustration after this list).
- As non-negative min-trace reconciliation using `MinTrace` uses the whole bottom level, $P$ is identical regardless.
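The flip mentioned above is essentially free because NumPy implements it as a strided view; a quick illustration with a toy matrix (not the package's code):

```python
import numpy as np

# np.unique returns the bottom-level column indices in sorted order, so the
# sliced S may need its columns reversed; np.fliplr does this in O(1) by
# returning a view with a negative column stride rather than copying.
S_slice = np.array([[1., 1., 0.],
                    [1., 0., 0.],
                    [0., 1., 0.],
                    [0., 0., 1.]])
S_flipped = np.fliplr(S_slice)

assert np.shares_memory(S_slice, S_flipped)   # no copy was made
assert S_flipped.base is S_slice              # just a view of the same buffer
```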
The benchmarks for `_get_PW_matrices` across a few popular hierarchical datasets of different sizes are below, measured on a machine with a 6-core/12-thread CPU at 2.6 GHz (dynamic frequency scaling up to 4.5 GHz) and 16 GB of RAM.
| Reconciler | $\textup{dim} \left ( P \right )$ | Before | After | Reduction |
|---|---|---|---|---|
| `BottomUp` | $\left( 32 \times 57 \right)$ | $10.7\, \mu s$ | $2.18\, \mu s$ | $79.6 \%$ |
| `BottomUp` | $\left( 150 \times 199 \right)$ | $47.3\, \mu s$ | $6.57\, \mu s$ | $86.1 \%$ |
| `BottomUp` | $\left( 304 \times 555 \right)$ | $171\, \mu s$ | $19.6\, \mu s$ | $88.5 \%$ |
| `BottomUp` | $\left( 30\,490 \times 42\,840 \right)$ | $12.1\, s$ | $66\, ms$ | $99.5 \%$ |
| `BottomUpSparse` | $\left( 32 \times 57 \right)$ | $601\, \mu s$ | $189\, \mu s$ | $68.6 \%$ |
| `BottomUpSparse` | $\left( 150 \times 199 \right)$ | $1.68\, ms$ | $192\, \mu s$ | $88.6 \%$ |
| `BottomUpSparse` | $\left( 304 \times 555 \right)$ | $4.83\, ms$ | $198\, \mu s$ | $95.9 \%$ |
| `BottomUpSparse` | $\left( 30\,490 \times 42\,840 \right)$ | $41\, s$ | $578\, \mu s$ | $100 \%$ |
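Timings of this kind can be reproduced with a simple micro-benchmark along the following lines (a sketch using stand-in matrices roughly matching the $\left( 304 \times 555 \right)$ row; the "old" construction here is only my approximation of the previous slicing approach, not the exact removed code):

```python
import timeit
import numpy as np

# Stand-in summing matrix with the identity block in the last n_b rows.
n_a, n_b = 251, 304
n = n_a + n_b
S = np.vstack([np.random.default_rng(0).random((n_a, n_b)), np.eye(n_b)])
idx_bottom = np.arange(n_a, n)

def p_old(S, idx_bottom):
    # Approximation of the previous approach: allocate a zero matrix shaped
    # like S, scatter the bottom rows of S into it via idx_bottom, transpose.
    P = np.zeros_like(S)
    P[idx_bottom] = S[idx_bottom]
    return P.T

def p_new(S):
    # New approach: build [0 | I] directly with a diagonal offset.
    n, n_b = S.shape
    return np.eye(n_b, n, k=n - n_b)

print(timeit.timeit(lambda: p_old(S, idx_bottom), number=1_000))
print(timeit.timeit(lambda: p_new(S), number=1_000))
```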