ITensor / ITensors.jl

A Julia library for efficient tensor computations and tensor network calculations
https://itensor.org
Apache License 2.0

[BlockSparseArrays] Blockwise matrix factorizations #1515

Open mtfishman opened 3 days ago

mtfishman commented 3 days ago

A key feature missing from BlockSparseArrays, as mentioned in #1336, is support for blockwise matrix factorizations. I'm starting an issue here to sketch out an implementation plan for that.

Here is an initial implementation of a blockwise SVD, using the new submodules of NDTensors.jl:

using BlockArrays: Block, blockedrange, blocks
using Dictionaries: Dictionary, set!
using LinearAlgebra: Diagonal, svd
using NDTensors.BlockSparseArrays: BlockSparseArray, block_stored_indices
using NDTensors.SparseArrayInterface: stored_indices
using NDTensors.SparseArrayDOKs: SparseMatrixDOK

# Convert an array into a Bool array that is true
# at the stored (structurally nonzero) entries.
function boolean_array(a::AbstractArray)
  b = similar(a, Bool)
  fill!(b, zero(eltype(b)))
  for I in stored_indices(a)
    b[I] = one(eltype(b))
  end
  return b
end

# Check that the matrix has at most one stored entry
# per row and per column, i.e. that its sparsity pattern
# is that of a (sub)permutation matrix.
function is_permutation_matrix(a::SparseMatrixDOK)
  ks = collect(Iterators.map(Tuple, stored_indices(a)))
  I = first.(ks)
  J = last.(ks)
  return allunique(I) && allunique(J)
end

function is_block_permutation_matrix(a::AbstractMatrix)
  return is_permutation_matrix(boolean_array(blocks(a)))
end

function block_svd(a::AbstractMatrix)
  # Blockwise SVD requires at most one stored block per
  # block row and block column.
  @assert is_block_permutation_matrix(a)
  bs = block_stored_indices(a)
  # Assign a diagonal block Block(i, i) of S to each stored block.
  s_blocks = map(i -> Block(i, i), 1:length(bs))
  bs_to_s_blocks = Dictionary(bs, s_blocks)
  us = Dictionary{eltype(bs),Matrix{eltype(a)}}()
  ss = Dictionary{eltype(bs),Matrix{eltype(a)}}()
  vts = Dictionary{eltype(bs),Matrix{eltype(a)}}()
  for b in bs
    usvb = svd(a[b])
    ub = usvb.U
    sb = Matrix(Diagonal(usvb.S))
    vbt = usvb.Vt
    # Diagonal block of S assigned to this stored block.
    b_s = bs_to_s_blocks[b]
    bu = Block(Int.((Tuple(b)[1], Tuple(b_s)[1])))
    bvt = Block(Int.((Tuple(b_s)[2], Tuple(b)[2])))
    set!(us, bu, ub)
    set!(ss, b_s, sb)
    set!(vts, bvt, vbt)
  end
  # Axis of the new internal dimension connecting S to U and V†.
  r_s = blockedrange(map(s_block -> size(ss[s_block], 1), s_blocks))
  u = BlockSparseArray(us, (axes(a, 1), r_s))
  s = BlockSparseArray(ss, (r_s, r_s))
  vt = BlockSparseArray(vts, (r_s, axes(a, 2)))
  return u, s, vt
end

Here's a demonstration that it works:

julia> a = BlockSparseArray{Float64}([2, 3], [2, 3]);

julia> a[Block(2, 1)] = randn(3, 2);

julia> a[Block(1, 2)] = randn(2, 3);

julia> a
typeof(axes) = Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}

Warning: To temporarily circumvent a bug in printing BlockSparseArrays with mixtures of dual and non-dual axes, the types of the dual axes printed below might not be accurate. The types printed above this message are the correct ones.

2×2-blocked 5×5 BlockSparseArray{Float64, 2, Matrix{Float64}, NDTensors.SparseArrayDOKs.SparseArrayDOK{Matrix{Float64}, 2, NDTensors.BlockSparseArrays.BlockZero{Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}}, Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}:
  0.0        0.0       │  -1.98464  0.0925584  -0.307224
  0.0        0.0       │  -1.8799   1.26757    -0.576902
 ──────────────────────┼────────────────────────────────
  0.558952   0.118232  │   0.0      0.0         0.0     
  1.24029    0.503875  │   0.0      0.0         0.0     
 -0.566776  -0.585867  │   0.0      0.0         0.0     

julia> u, s, vt = block_svd(a);

julia> u
typeof(axes) = Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}

Warning: To temporarily circumvent a bug in printing BlockSparseArrays with mixtures of dual and non-dual axes, the types of the dual axes printed below might not be accurate. The types printed above this message are the correct ones.

2×2-blocked 5×4 BlockSparseArray{Float64, 2, Matrix{Float64}, NDTensors.SparseArrayDOKs.SparseArrayDOK{Matrix{Float64}, 2, NDTensors.BlockSparseArrays.BlockZero{Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}}, Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}:
  0.0       0.0       │  -0.642224  -0.766517
  0.0       0.0       │  -0.766517   0.642224
 ─────────────────────┼──────────────────────
 -0.338149  0.438603  │   0.0        0.0     
 -0.815799  0.304476  │   0.0        0.0     
  0.469178  0.845531  │   0.0        0.0     

julia> s
typeof(axes) = Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}

Warning: To temporarily circumvent a bug in printing BlockSparseArrays with mixtures of dual and non-dual axes, the types of the dual axes printed below might not be accurate. The types printed above this message are the correct ones.

2×2-blocked 4×4 BlockSparseArray{Float64, 2, Matrix{Float64}, NDTensors.SparseArrayDOKs.SparseArrayDOK{Matrix{Float64}, 2, NDTensors.BlockSparseArrays.BlockZero{Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}}, Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}:
 1.63656  0.0       │  0.0      0.0     
 0.0      0.323677  │  0.0      0.0     
 ───────────────────┼───────────────────
 0.0      0.0       │  2.97427  0.0     
 0.0      0.0       │  0.0      0.817929

julia> vt
typeof(axes) = Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}

Warning: To temporarily circumvent a bug in printing BlockSparseArrays with mixtures of dual and non-dual axes, the types of the dual axes printed below might not be accurate. The types printed above this message are the correct ones.

2×2-blocked 4×5 BlockSparseArray{Float64, 2, Matrix{Float64}, NDTensors.SparseArrayDOKs.SparseArrayDOK{Matrix{Float64}, 2, NDTensors.BlockSparseArrays.BlockZero{Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}}, Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}:
 -0.896243  -0.443562  │  0.0        0.0        0.0     
  0.443562  -0.896243  │  0.0        0.0        0.0     
 ──────────────────────┼────────────────────────────────
  0.0        0.0       │  0.913015  -0.346659   0.215015
  0.0        0.0       │  0.383828   0.908532  -0.16506 

julia> u * s * vt
typeof(axes) = Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}

Warning: To temporarily circumvent a bug in printing BlockSparseArrays with mixtures of dual and non-dual axes, the types of the dual axes printed below might not be accurate. The types printed above this message are the correct ones.

2×2-blocked 5×5 BlockSparseArray{Float64, 2, Matrix{Float64}, NDTensors.SparseArrayDOKs.SparseArrayDOK{Matrix{Float64}, 2, NDTensors.BlockSparseArrays.BlockZero{Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}}, Tuple{BlockArrays.BlockedOneTo{Int64, Vector{Int64}}, BlockArrays.BlockedOneTo{Int64, Vector{Int64}}}}:
  0.0        0.0       │  -1.98464  0.0925584  -0.307224
  0.0        0.0       │  -1.8799   1.26757    -0.576902
 ──────────────────────┼────────────────────────────────
  0.558952   0.118232  │   0.0      0.0         0.0     
  1.24029    0.503875  │   0.0      0.0         0.0     
 -0.566776  -0.585867  │   0.0      0.0         0.0     

There are a number of improvements to make to the code above:

  1. Store S as a BlockSparseMatrix whose blocks are Diagonal or DiagonalMatrix instead of Matrix. That should be relatively easy to support; I just haven't tested it, and there are probably a few issues to work out in BlockSparseArrays. (A sketch combining this with item 2 follows the list.)
  2. Make it generic over GPU backends, i.e. infer the block type from the input instead of hardcoding it to Matrix, which lives on CPU. (See the first sketch after the list.)
  3. Add support for truncating the blocks. In the simplest case, I'm picturing that we just call the dense/full-rank version above, analyze the singular values to determine the rank to keep in each block, and then slice the blocks using the slicing operation introduced in #1489. (See the second sketch after the list.)
  4. Make the construction of the BlockSparseMatrix representations of $U$, $S$, and $V^{\dagger}$ a little more automated. In the code above, I first store the results of SVDing each block in dictionaries holding the blocks of $U$, $S$, and $V^{\dagger}$, and then build the BlockSparseMatrix representations from those blocks, along with axes determined from the input matrix and the results of the SVD. I think that logic is pretty good and simple, but maybe it can be simplified a little. For example, we could store the blocks in a SparseMatrixDOK and then convert it to a BlockSparseMatrix, which could determine the block sizes automatically.
  5. Handle BlockSparseMatrix inputs that are invariant under group symmetries. Really, the only thing that needs to change above is attaching symmetry/irrep labels to the blocks of the axes of the internal dimensions (those connecting $S$ to $U$ and $V^{\dagger}$, called r_s in the code above). The logic for that would be pretty simple, and is uniquely determined by the convention that $U$ and $V^{\dagger}$ are in trivial symmetry sectors (have zero flux).
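
For items 1 and 2, here is a minimal sketch of a variant of block_svd. It is untested and makes two assumptions beyond what is demonstrated above: that the BlockSparseArray constructor accepts Diagonal blocks (item 1, noted above as untested), and that storing blocks in untyped dictionaries is acceptable so that the U and V† block types are simply whatever svd returns for each stored block (so a GPU input would produce GPU factor blocks):

# Sketch: keep each S block as a Diagonal (item 1) and avoid
# hardcoding Matrix block types (item 2), so the factor blocks
# have whatever type `svd` returns for the stored blocks.
function block_svd_diag(a::AbstractMatrix)
  @assert is_block_permutation_matrix(a)
  bs = block_stored_indices(a)
  s_blocks = map(i -> Block(i, i), 1:length(bs))
  bs_to_s_blocks = Dictionary(bs, s_blocks)
  # Untyped value storage; block types are determined by the data.
  us = Dictionary{eltype(bs),Any}()
  ss = Dictionary{eltype(bs),Any}()
  vts = Dictionary{eltype(bs),Any}()
  for b in bs
    usvb = svd(a[b])
    b_s = bs_to_s_blocks[b]
    set!(us, Block(Int.((Tuple(b)[1], Tuple(b_s)[1]))), usvb.U)
    # Keep the singular values as a Diagonal block.
    set!(ss, b_s, Diagonal(usvb.S))
    set!(vts, Block(Int.((Tuple(b_s)[2], Tuple(b)[2]))), usvb.Vt)
  end
  r_s = blockedrange(map(s_block -> size(ss[s_block], 1), s_blocks))
  u = BlockSparseArray(us, (axes(a, 1), r_s))
  s = BlockSparseArray(ss, (r_s, r_s))
  vt = BlockSparseArray(vts, (r_s, axes(a, 2)))
  return u, s, vt
end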
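For item 3, here is a rough sketch of the simplest truncation scheme described above. Rather than using the #1489 slicing operation, it rebuilds the factors from per-block slices: SVD every stored block, pick a global singular value cutoff across all blocks, and keep only the leading columns/rows of each block's factors (singular values come back sorted in descending order, so the kept part of each block is a leading slice):

# Sketch of a truncated blockwise SVD: keep (roughly) the
# `maxrank` largest singular values across all stored blocks.
function block_svd_truncated(a::AbstractMatrix, maxrank::Int)
  @assert is_block_permutation_matrix(a)
  bs = block_stored_indices(a)
  s_blocks = map(i -> Block(i, i), 1:length(bs))
  bs_to_s_blocks = Dictionary(bs, s_blocks)
  # First pass: SVD every stored block and find the global cutoff,
  # i.e. the smallest singular value that is kept overall.
  fs = [svd(a[b]) for b in bs]
  all_s = sort!(reduce(vcat, [f.S for f in fs]); rev=true)
  cutoff = all_s[min(maxrank, length(all_s))]
  us = Dictionary{eltype(bs),Matrix{eltype(a)}}()
  ss = Dictionary{eltype(bs),Matrix{eltype(a)}}()
  vts = Dictionary{eltype(bs),Matrix{eltype(a)}}()
  for (f, b) in zip(fs, bs)
    # Singular values are sorted in descending order, so the kept
    # part of each block is a leading slice. Ties at the cutoff
    # may keep slightly more than `maxrank` values.
    r = count(>=(cutoff), f.S)
    b_s = bs_to_s_blocks[b]
    set!(us, Block(Int.((Tuple(b)[1], Tuple(b_s)[1]))), f.U[:, 1:r])
    set!(ss, b_s, Matrix(Diagonal(f.S[1:r])))
    set!(vts, Block(Int.((Tuple(b_s)[2], Tuple(b)[2]))), f.Vt[1:r, :])
  end
  r_s = blockedrange(map(s_block -> size(ss[s_block], 1), s_blocks))
  u = BlockSparseArray(us, (axes(a, 1), r_s))
  s = BlockSparseArray(ss, (r_s, r_s))
  vt = BlockSparseArray(vts, (r_s, axes(a, 2)))
  return u, s, vt
end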

For some references on other implementations of blockwise matrix factorizations, see https://github.com/ITensor/ITensors.jl/blob/v0.6.16/NDTensors/src/lib/BlockSparseArrays/src/backup/qr.jl and https://github.com/JuliaArrays/BlockDiagonals.jl/blob/master/src/linalg.jl.

@ogauthe @emstoudenmire

ogauthe commented 2 days ago

Looks nice, I'm eager to try it.

Regarding truncation, the simplest approach is indeed to compute the dense blockwise SVD and truncate afterwards. In the case of non-abelian symmetries, there are choices to be made in how to define the rank to keep in each block.

When non-abelian symmetries are involved, one does not control $D$ directly but $D^*$, the number of kept multiplets (each kept multiplet of dimension $d$ contributes $d$ to $D$). One can either directly set $D^*$ as a target, or else set a target $D$ and keep adding multiplets until $D$ is reached. This choice matters because, when simulating a given model, the multiplet decomposition of a singular value spectrum may change dramatically depending on parameters such as the ratio $r$ of two coupling constants.

If one sets $D^*$ as a target, truncation is straightforward. The tradeoff is that $D$ is not controlled and may reach very different values as $r$ is tuned. Moreover, it makes it hard to benchmark simulations done with different symmetries (for instance, making use of an extended symmetry at an exceptional point), since the map from $D^*$ to $D$ is symmetry dependent.

If one sets $D$ as a target, it may not be possible to reach it exactly. One then has to decide how to set $D^*$: possible choices are the largest $D^*$ for which the total dimension stays at or below $D$, the smallest $D^*$ for which the total dimension reaches $D$, or the $D^*$ whose total dimension is closest to $D$ (sketched below).
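
To make the three conventions concrete, here is a minimal illustrative sketch (the helper choose_Dstar is hypothetical, not part of any package), assuming the kept multiplets are sorted by decreasing singular value and dims[m] is the dimension of the m-th multiplet:

# Given the dimensions of the multiplets, sorted by decreasing
# singular value, choose D* (a number of multiplets) for a
# target total dimension D. `rule` selects one of the three
# conventions discussed above.
function choose_Dstar(dims::Vector{Int}, D::Int; rule::Symbol)
  totals = cumsum(dims)  # totals[m] = the D obtained by keeping m multiplets
  if rule == :below      # largest D* with total dimension <= D
    return something(findlast(<=(D), totals), 0)
  elseif rule == :above  # smallest D* with total dimension >= D
    return something(findfirst(>=(D), totals), length(dims))
  elseif rule == :closest  # D* whose total dimension is closest to D
    return argmin(m -> abs(totals[m] - D), eachindex(totals))
  end
  return error("unknown rule $rule")
end

# Example: multiplets of dimensions 1, 3, 5 (e.g. spin 0, 1, 2)
# and a target D = 6, so totals = [1, 4, 9]:
choose_Dstar([1, 3, 5], 6; rule = :below)    # 2 (D = 4)
choose_Dstar([1, 3, 5], 6; rule = :above)    # 3 (D = 9)
choose_Dstar([1, 3, 5], 6; rule = :closest)  # 2 (|4 - 6| < |9 - 6|)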