Hierarchy measures: speed vs memory?

bmcfee commented 2 years ago

Most of the hierarchy measures are defined in terms of a meet matrix. This is a square matrix where each row and column corresponds to a frame (default, sampled at 10Hz per the segment metrics), and the value is a small integer corresponding to the maximum depth within the hierarchy at which frames i,j receive the same label. We do this twice, once for the reference and once for the estimate.

For a typical ~4min track, this object is of size 2400x2400, so not huge, but not tiny either. In most cases, most of the matrix is empty, so we store it as a CSR (compressed sparse rows) matrix. However, we don't know the sparsity pattern in advance, so it is first built as a LIL matrix and then converted to CSR, as seen here (mir_eval.hierarchy._meet): https://github.com/craffel/mir_eval/blob/576aad4e0b5931e7c697c078a1153c99b885c64f/mir_eval/hierarchy.py#L215-L236

Rather than iterating over rows or columns, we use slice notation to assign blocks at a time. It turns out that this is horrendously slow in LIL matrices, and comprises the major bottleneck in the evaluation. The subsequent processing mainly involves sorting and inversion counting, and I believe is about as fast as can be (cf #201 and #225).

If we're willing to sacrifice memory for speed, it's simple enough to replace the initial LIL matrix by an ndarray. (I've done this in a local branch, and it is indeed significantly faster and sufficient for my current needs.) This hardly seems like an ideal solution though, as it completely defeats the purpose of constraining memory by going through a sparse representation. I expect that with a bit of cleverness, we could probably bypass this initial lil/dense issue and go directly to CSR format.

craffel commented 2 years ago

Thanks Brian. Maybe as a preliminary step we could add an arg that specifies whether to use a LIL matrix or an ndarray?

bmcfee commented 2 years ago

we could add an arg that specifies whether to use a LIL matrix or an ndarray?

I thought about that. It's doable, but kind of ugly from an API perspective since it would have to thread through the entire module.

craffel / mir_eval

Hierarchy measures: speed vs memory? #340