Open dlfivefifty opened 2 years ago
To understand properly, this discussion is entirely about how to do the solve of a diagonally banded block banded matrix? What's the current way things are solved? QR on the whole matrix?
For general banded-block-banded matrix it just reduces to block-banded QR, that is, it treats the blocks as dense. This is because there is invariably fill-in. This is an O(N^2) = O(n^4)
algorithm (where N
is total degrees of freedom, n
is the number of blocks). (In theory it can be reduced to O(N^(3/2))
using hierarchical methods but this may be unstable.)
At the moment diagonal-block-banded is handled the same way. Though using any of the proposals above it reduces to O(N)
. Also it should be easy to multithread (though we'll see whether my banded QR is actually multithread safe....)
ok! And in option (1), each individual block would be copied to a banded matrix and then a QR factorization is applied? So there would be n
QR factorizations in both (1) and (3), where n
is the number of blocks?
I guess te advantage of (1) is that it can be applied to any problem that induces a diagonal banded block banded matrix. Rather than constraining ourselves to the ModalInterlace
case.
I suppose to answer this I'd need to have a better understanding of howModalInterlace
is meant to be used - the tests are a bit sparse, any more natural examples?
Lacking this, I'd also lean towards (1).
I guess te advantage of (1) is that it can be applied to any problem that induces a diagonal banded block banded matrix. Rather than constraining ourselves to the ModalInterlace case.
Exactly. The negative is you take banded matrices with simple rational formulae and nice memory sequential properties, rearrange as a BandedBlockBandedMatrix
, and then break apart by copying. Whether this extra rearranging has a significant impact on performance is not clear to me.
Rotationally invariant operators are typically represented by
ModalInterlace
which combines matrices acting only on Fourier modes into a diagonal-block-banded matrix (that is, block banded matrix with diagonal blocks, i.e. subblockbandwidths are(0,0)
).At the moment we aren't taking advantage of this structure. There are two approaches:
BandedBlockBandedMatrix
by checking whether the subblockbandwidths are(0,0)
. The easiest way would be to copy to a banded matrix, do a QR, and then copy the data back.DiagonalBlockBandedMatrix
though this doesn't seem to have any obvious benefits over (1).ModalInterlace
. This has the benefit of being able to callqr
on each of the banded operators separately (and potentially taking advantage of@threads
). Though probably (1) could also be parallelised.@TSGut @ioannisPApapadopoulos any thoughts? Do you think this is important right now? It won't help with the variable coefficients problem but could be important for fractional DEs and time-stepping.