pablosanjose / Quantica.jl

Simulation of quantum systems on a lattice

Taking blocks seriously #230

Closed pablosanjose closed 10 months ago

pablosanjose commented 10 months ago

Here I'd like to discuss how to represent the block structure (i.e. orbital groups corresponding to each cell/site) of two types of objects:

  1. Bloch Hamiltonians such as h(φs; params...), which currently yields a flat SparseMatrixCSC over all orbitals in one unit cell

  2. Green functions such as g[siteselector](ω), which currently yields a flat Matrix over all orbitals in the LatticeSlice defined by siteselector

The issue is how to recover the substructure in these flat matrices corresponding to each (multiorbital) site or unit cell. Currently this is clunky or impossible. This functionality is necessary in many instances, but in particular to be able to implement Hartree-Fock mean fields #229, since the density matrix (a Green-function-derived object) needs to be site-indexable.
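To make the problem concrete, here is a toy illustration (plain SparseArrays, not Quantica code) of how per-site blocks sit inside a flat matrix, assuming a uniform number of orbitals per site:

```julia
using SparseArrays

# Toy setting: 2 sites with 2 orbitals each, so the flat matrix is 4x4 and the
# block coupling sites i and j occupies rows 2i-1:2i and columns 2j-1:2j.
flat = sparse([1.0 0.1 0.0 0.2;
               0.1 1.0 0.2 0.0;
               0.0 0.2 2.0 0.3;
               0.2 0.0 0.3 2.0])

orbrange(i, norbs = 2) = (norbs * (i - 1) + 1):(norbs * i)  # orbitals of site i
siteblock(m, i, j) = Matrix(m[orbrange(i), orbrange(j)])    # inter-site block

siteblock(flat, 2, 1)   # the substructure the flat output does not expose
```

Recovering `siteblock(flat, 2, 1)` from the flat output alone requires knowing the orbital layout, which is exactly the information the current return types discard.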

Warning: this is a bit of a heavy-going, internal-developer-dialogue kind of issue.

Bloch Hamiltonians

In master, we already have both a flat and an unflat SparseMatrixCSC representation of each of the Bloch Harmonics h[dcell]. Each of these Harmonics is internally stored as a HybridSparseMatrix that takes care of syncing the flat and block-sparse representations after each update of model parameters.

Currently, if we do h[unflat(dcell)] we get the block-sparse representation of that specific Harmonic. However, we don't yet have a way to do the same for the full Bloch Hamiltonian (sum over harmonics with a given k). In other words, h(unflat(φs); params...) is not defined.

One of the reasons is that I don't quite like the idea of wrapping dcell or φs with unflat for this. I would much prefer to be able to do h[dcell] |> unflat or h(φs; params...) |> unflat. But if h[dcell] and h(φs; params...) are already flat SparseMatrixCSC, they have already lost the information to rebuild them as a block sparse matrix.

I see two options:

  1. Define a macro @unflat, such that @unflat h[dcell] gets lowered to unflat_getindex(h, dcell) or similar, and @unflat h(φs; params...) becomes unflat_bloch!(h, sanitize_SVector(T, φs)) (instead of the current flat_bloch!(h, sanitize_SVector(T, φs))).

  2. Make both h[dcell] and h(φs; params...) return a result::HybridSparseCSC, such that unflat(result) yields a block-sparse matrix, and perhaps flat(result) yields the current result. To make things non-breaking, we would need to make HybridSparseCSC act in all other respects as a flat SparseMatrixCSC, in regards to getindex, setindex, show, mul!, nonzeros, etc... We could also admit a breaking change, and require the user to call flat or unflat before using the result (maybe the cleanest way).

In both cases, we would deprecate the current h[unflat(dcell)] syntax.

Approach 2 may seem to involve some overhead, since it seems we need to build two versions of the Bloch matrix, where we usually only need the flat one (e.g. for diagonalization or use in a GreenSolver, which is 90% of the time). But note that HybridSparseCSC is smart, in the sense that it can be initialized with just the flat version of the matrix, and the unflat one gets built from the former upon calling unflat (only the first time, after that both versions are marked in-sync).
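That lazy-build behavior can be mocked up as follows (all names hypothetical; the real HybridSparseCSC is more general, handling variable block sizes and parameter updates):

```julia
using SparseArrays

# Hypothetical mock of option 2: wrap the flat matrix, and build the block
# (unflat) form lazily, caching it so later calls are free.
mutable struct MockHybrid
    flat::SparseMatrixCSC{Float64,Int}
    norbs::Int                                        # orbitals per site (uniform)
    blocks::Union{Nothing,Matrix{Matrix{Float64}}}    # built on first unflat call
end

MockHybrid(flat, norbs) = MockHybrid(flat, norbs, nothing)

flatmat(h::MockHybrid) = h.flat   # the representation solvers use most of the time

function unflatmat(h::MockHybrid)
    if h.blocks === nothing       # first call: build and mark in-sync
        n = size(h.flat, 1) ÷ h.norbs
        rng(i) = (h.norbs * (i - 1) + 1):(h.norbs * i)
        h.blocks = [Matrix(h.flat[rng(i), rng(j)]) for i in 1:n, j in 1:n]
    end
    return h.blocks
end
```

Users who never call unflatmat pay essentially nothing beyond the wrapper itself.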

Green functions over a LatticeSlice

The current result of gmat = g[siteselector](ω) or gmat = g[siteselector´, siteselector](ω) is borderline useless when we have several orbitals per site, because we get a flat dense matrix, and no information about which entry corresponds to which orbital. We should actually return something different, which would be a breaking change, but it is probably unavoidable. The question is what should we return.

One of the missing functionalities is the ability to do gmat[sites´, sites], where sites and sites´ are a CellSites (which can currently be created with cellsites(cell_indices, site_indices), already exported). Note that what we can already do is to use those sites in place of siteselector beforehand (i.e. gmat = g[sites´, sites](ω)), but not after construction.

Another functionality is the computation of traces. If gmat = g[siteselector](ω) and we have several orbitals per site, we probably would like tr(gmat) to return an SMatrix, so that trace is a sum over sites, not orbitals. Without this, we cannot easily compute things like the hole component of the LDOS in a superconducting model, or the magnetic moment in a ferromagnet model.
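One possible implementation of such a site-resolved trace, sketched here for a uniform number of orbitals per site (the sitetrace name and signature are illustrative, not Quantica API):

```julia
using LinearAlgebra

# Sum the diagonal orbital blocks of a flat matrix: the result is an
# norbs x norbs matrix (e.g. electron/hole-resolved in a Nambu basis),
# so the trace runs over sites, not orbitals.
function sitetrace(m::AbstractMatrix, norbs::Integer)
    n = size(m, 1) ÷ norbs
    rng(i) = (norbs * (i - 1) + 1):(norbs * i)
    return sum(m[rng(i), rng(i)] for i in 1:n)
end
```

With norbs = 2 Nambu components per site, contracting the result with a Pauli matrix would then separate electron and hole contributions to the LDOS.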

Like for Bloch Hamiltonians, I see basically two options:

  1. Make an @unflat macro so that gmat = @unflat g[siteselector...](ω) yields a Matrix of Matrices/Views/SMatrices instead of a flat Matrix.

  2. Make g[siteselector...](ω) always return a HybridDenseMatrix, to be defined, that can itself be cast into a flat or block dense matrix with flat/unflat. The unflat version should probably be a Matrix of Views, since the block size may be variable in the multiorbital case (so the eltype cannot be a fixed SMatrix type).

Indexing

As can probably be read between the lines, I am more inclined towards option 2 in both cases. In the case of Bloch Hamiltonians, the unflat of a multiorbital h would yield a SparseMatrixCSC{SMatrixView}, where SMatrixView is a Quantica type representing an SMatrix padded with the required zeros to allow variable block size. This I don't like very much, to be honest. We could perhaps do a quick conversion when calling unflat so that we get a SparseMatrixCSC{SubArray} (i.e. a sparse matrix of views, like in option 2 of the Green function case).

Apart from the flat/unflat functionality, there is another reason why I like option 2: advanced indexing.

We could define view(h::HybridSparseMatrix, sites´, sites), where sites and sites´ are a collection of Integers (site indices in the unit cell). This could return a SubArray{HybridSparseMatrix}, which could be cast into a sparse matrix with flat/unflat, which in this case would make an independent sparse array. Since it doesn't make sense to specify cells for a Bloch matrix, we should then probably rename HybridSparseMatrix to HybridBlochMatrix (this also prepares the way for a potential dense version of a Bloch Matrix, to be discussed).

We could also define view(g::HybridDenseMatrix, sites´, sites), where sites and sites´ are now CellSites, i.e. collections of cell and site indices. This would again create a SubArray{HybridDenseMatrix}, which could be made flat/unflat. For symmetry, we could rename HybridDenseMatrix to e.g. HybridSliceMatrix to emphasize its LatticeSlice domain, instead of its Dense nature. This way we could perhaps leave the door open to a sparse version of HybridSliceMatrix, such as a sparse density matrix with a mask (using SuiteSparseGraphBLAS.jl). If any requested site in view(g::HybridSliceMatrix, sites´, sites) is not contained in the LatticeSlice of g, a BoundsError should probably be thrown.

This defines a view -> flat/unflat workflow. Should we define a getindex too? Probably not, since it is equivalent to establishing a flat or unflat preference, like in current master.

Implementation

The HybridBlochMatrix is essentially implemented already. We just need to adapt the code to (a) call flat in certain places (and export the flat function), and (b) implement the flat/unflat of SubArray{HybridBlochMatrix}.

The HybridSliceMatrix needs to be implemented from scratch. It requires some constructor that can take a flat Matrix (the current output of all GreenSolvers), a LatticeSlice + OrbitalBlockStructure from the generating parent object, and build with those a Dictionary that can map cell+site indices for sites included in the LatticeSlice to orbital UnitRanges, i.e. indices in the flat Matrix. Then, view(g::HybridSliceMatrix, s´::CellSites, s::CellSites) would find the index ranges by a getindex call to the Dictionary, and then build a view of the flat Matrix with them. If the CellSites are all contiguous (i.e. site_indices are contiguous and cell_indices are a single cell), the Dictionary should be able to provide a single index UnitRange, so that the returned SubArray{Matrix} is non-allocating. Otherwise we would need to allocate a Vector of indices of the flat Matrix, and the view will unavoidably allocate.
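A stripped-down sketch of that constructor and view logic (stdlib Dict in place of Dictionaries.jl, plain tuples in place of CellSites; all names hypothetical):

```julia
# Map (cell, site) keys to orbital UnitRanges in the flat matrix, so a view of
# an inter-site block can be built without copying.
struct MockSliceMatrix
    flat::Matrix{ComplexF64}
    ranges::Dict{Tuple{Int,Int},UnitRange{Int}}   # (cell, site) => flat indices
end

# View of the block connecting (cell´, site´) and (cell, site); contiguous
# UnitRanges keep the view non-allocating.
function siteview(g::MockSliceMatrix, s´::Tuple{Int,Int}, s::Tuple{Int,Int})
    haskey(g.ranges, s´) && haskey(g.ranges, s) ||
        throw(BoundsError(g, (s´, s)))            # site not in the LatticeSlice
    return view(g.flat, g.ranges[s´], g.ranges[s])
end
```

In the real implementation the dictionary would be built once from the LatticeSlice + OrbitalBlockStructure, and non-contiguous CellSites would fall back to Vector-valued indices.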

Note: I mention Dictionary because in considering this implementation it is only natural to consider also a refactor of all our collections of cell objects, such as Harmonics inside Hamiltonian, or LatticeSlice itself. We could convert the current Vector-of-cells to a Dictionary with cell::SVector{L,Int} as keys. The virtue of Dictionaries.jl as opposed to Base.Dict is that iteration over values (i.e. cells) is very efficient, which is the reason we used Vectors in Quantica for cell containers in the first place. Moving to Dictionaries.jl for this (a very minimal dependency) would not slow down iteration, and would speed up cell indexing considerably (we currently do an O(N) search).
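The lookup change can be illustrated with Base.Dict (the proposal would use Dictionaries.jl instead, for its fast value iteration); all names here are illustrative:

```julia
# Current style: cells stored in a Vector, found by an O(N) linear search.
cells = [(0, 0), (1, 0), (0, 1), (1, 1)]
harmonics = [fill(float(k), 2, 2) for k in 1:4]      # one matrix per cell

linear_lookup(c) = harmonics[findfirst(==(c), cells)]

# Proposed style: a dictionary keyed by the cell, giving O(1) average lookup
# while still allowing iteration over all stored harmonics.
bycell = Dict(zip(cells, harmonics))
dict_lookup(c) = bycell[c]
```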

Scalar case

In the scalar case (each site has a single orbital), there is no distinction between flat and unflat. It might seem that in such a case we should return a SparseMatrixCSC directly instead of a HybridBlochMatrix. This may be acceptable. However, we would still need to build a HybridSliceMatrix in the case of GreenFunctions, since we need to support cell+site indexing. To avoid surprising behaviors I am inclined to not make exceptions, and make the return type always be HybridBlochMatrix or HybridSliceMatrix, regardless of whether the problem is single or multiorbital.

pablosanjose commented 10 months ago

Direct flat/unflat Indexing

About the indexing functionality above, there is a much better syntax than the proposed one. Instead of doing the following to get the block Matrix of g_hybrid_mat = g[siteselector](ω)

view(g_hybrid_mat, cellsites(...), cellsites(...)) |> unflat

we could repurpose flat and unflat to do indexing, so the above could be written as

g_hybrid_mat |> unflat(cellsites(...), cellsites(...))

This would immediately return a Matrix{SubArray}. If we used flat, we would instead obtain a SubArray.
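The curried form could be sketched as follows, with plain per-site orbital ranges standing in for cellsites (the names unflat_inds/flat_inds are placeholders, not the proposed exported names):

```julia
# With no matrix argument, the indexer returns a closure, so the proposed
#   g_hybrid_mat |> unflat(sites´, sites)
# syntax reduces to ordinary function application. Here is/js are collections
# of per-site orbital ranges into the flat matrix.
unflat_inds(is, js) = m -> [view(m, i, j) for i in is, j in js]       # Matrix of views
flat_inds(is, js) = m -> view(m, reduce(vcat, is), reduce(vcat, js))  # one SubArray
```

Usage, for two 2-orbital sites: `m |> unflat_inds([1:2, 3:4], [1:2, 3:4])` yields a 2x2 Matrix of 2x2 views.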

In other words, there is no need to go through a SubArray of a HybridSliceMatrix, which would not have any meaningful functionality except as an argument to flat/unflat.

We could then extend this also to the HybridBlochMatrix, although in that case we would just support Integer site indices as arguments.

pablosanjose commented 10 months ago

Diagonal indexing

There is a special functionality currently whereby g[diagonal(siteselector)](ω) will return a flat Vector, instead of a flat Matrix. What should we do in this case? We should probably revert to returning a Vector{B} where B is the blocktype of the Hamiltonian. Then HybridSliceMatrix should be parametric in the wrapped array, so we can distinguish HybridSliceMatrix{Matrix{Complex{T}}} from HybridSliceMatrix{Diagonal{B}}. In other words, while in the normal case we wrap a flat matrix and reconstruct views of it when calling unflat, in the diagonal case we wrap a Diagonal of blocks, and then flat returns a Diagonal{Complex{T}}, while unflat returns a Diagonal{SubArray}, i.e. a Diagonal of views of the wrapped diagonal. That's the general case. If B is uniform, we could replace the SubArray by B (to be thought through).

The gω[diagonal(..., kernel = K)] functionality would be replaced by an explicit gω[diagonal(...)] |> unflat .|> x->tr(x*K), and/or we could add kernel as a kwarg to flat, so we would use gω[diagonal(...)] |> flat(kernel = K)
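The kernel contraction itself is a one-liner over the unflat blocks; a sketch with an explicit Pauli-z kernel (toy data, not Quantica API):

```julia
using LinearAlgebra

# Per-site diagonal blocks (e.g. 2x2 Nambu blocks of a density matrix) and a
# kernel K; tr(x * K) contracts each block to one scalar per site.
K = [1.0 0.0; 0.0 -1.0]                       # σz: electron minus hole
blocks = [[0.7 0.0; 0.0 0.2], [0.5 0.1; 0.1 0.5]]
persite = [tr(x * K) for x in blocks]         # replaces the kernel = K kwarg
```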

pablosanjose commented 10 months ago

I have a lingering worry. While the case of HybridBlochMatrix is clearer, and the plain flat/unflat of a HybridSliceMatrix is also quite transparent, the indexing functionality of a HybridSliceMatrix through flat/unflat or view seems redundant, no? In the sense that HybridSliceMatrix is already the result of a slice. The original intent of GreenSlice was precisely to do the non-trivial indexing over sites, so that if gω = g(ω), we could then do gω[cellsites(...), cellsites(...)] and get a flat matrix over these sites. But now we are considering a re-slicing functionality of the result. Why? What is the use case that is not addressed by gω[cellsites(...), cellsites(...)]? Does this seem like a design mistake?

pablosanjose commented 10 months ago

Indexing memoization as an alternative to HybridSliceMatrix

I think I now see the root of the above concern. The problem is the artificial two-stage indexing of g::GreenFunction that we are considering, such as g[siteselector...](ω)[sites´, sites]. We are making g[siteselector...](ω) a special type (HybridSliceMatrix) to allow for the flat/unflat functionality. And then we are tacking onto that type a special indexing functionality that works like the first siteselector, but acting on HybridSliceMatrix instead of GreenFunction.

What we need is for repeated calls to gω[sites´, sites] to be efficient. Let me explain.

The original reason for considering this two-stage indexing functionality was to support computing ρ = density(g[siteselector...]), and then, later ρ[site´, site] when passing this ρ as a parameter to a Hartree-Fock model.

Instead of the above, what we want is to support ρ = density(g) (without any prior slicing), and then to have ρ[site´, site] be efficient. This means that if we compute ρ[site1, site1] for a given site1, we are getting (with many solvers) also ρ[site2, site1] for all site2 in the same unit cell as site1. So it becomes really wasteful to recompute ρ[site2, site1] for each site2. Hence, we argued, it would be better to compute ρ = density(g[siteselector...]) for a large enough site selection, and then index over them when building the Hartree-Fock Hamiltonian.

The correct solution is now much clearer: what we need is a caching mechanism that stores all the ρ[site2, site1] for different site2 and a fixed site1. Then, when we index into some site pairs that have already been computed, we should use the cache.

Interestingly, we in fact already have such a cache functionality. We use it for gω = g(ω) inside CurrentDensitySolution, and it is called GreenSolutionCache.

What we need to do then is to include, inside any indexable type that involves a non-trivial computation (GreenSolution, CurrentDensitySolution, and now DensityMatrix or whatever we call it) a memoizing cache of getindex results. Unlike GreenSolutionCache, which currently stores full slices of the type ρ[cellsites(cell, :), site1] in a Dict, we should store the result between individual sites as views (perhaps in a Dictionary from Dictionary.jl instead of a Dict).

Now, this involves a full reset of the original proposal. What is the need to define a HybridSliceMatrix when we can repeatedly index over individual sites to get the SubArray blocks for them? No, we don't need it anymore. What we need is this general cache mechanism that stores indexing results. It could be called SiteIndexCache. And the result of gω[sites´, sites] should always be a flat AbstractMatrix (probably a SubArray), never an unflat matrix. Iterating over the block for each site pair in sites´ and sites would not need a recompute, it would be cheap once gω[sites´, sites] has been evaluated.
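A minimal sketch of such a memoizing getindex cache (stdlib Dict for simplicity, where the proposal suggests Dictionaries.jl; BlockCache and all other names are hypothetical):

```julia
# Cache single-site-pair blocks so that re-indexing previously computed pairs
# skips the expensive solve.
struct BlockCache
    compute::Function                              # expensive (site´, site) -> block
    cache::Dict{Tuple{Int,Int},Matrix{ComplexF64}}
end
BlockCache(f) = BlockCache(f, Dict{Tuple{Int,Int},Matrix{ComplexF64}}())

Base.getindex(c::BlockCache, s´::Int, s::Int) =
    get!(() -> c.compute(s´, s), c.cache, (s´, s))  # compute only on a cache miss
```

In the real case the compute function would be the solver call that produces a full column of site blocks, and the cache would be populated with all of them at once.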

The special case of diagonal indexing should also be adapted to use the cache. Would need to work out the details.

Revisiting HybridSparseMatrix for Bloch Hamiltonian matrices

The above solution implies never returning a Matrix of blocks, but always a flat Matrix over the requested sites. I like this also because we never expose an SMatrixView for cases with non-homogeneous orbital sizes. In view of this, do we still want to return a HybridSparseMatrix instead of a SparseMatrixCSC when calling h[dn] or h(φs; params...)? What exactly would be the use case?

We have encountered one case in real world use, where we needed to compute the topological invariant of a superconducting Hamiltonian. This is the Pfaffian of i σ_y τ_y * H(k) for certain high-symmetry points k. This operation requires block resolution of H(k). But building P = im * σ_y * τ_y * H can be trivially done with @onsite! and @hopping! modifiers, and then pfaffian(P(k)), so we don't really need block-resolved H(k).

I don't really see an imperative need for a block-resolved h(φs; params...) any more. I think that, for the moment, we should therefore concentrate only on the second part of the proposal, modified as explained below.

Updated proposal

pablosanjose commented 10 months ago

#232 has a simpler approach to block access that seems more transparent than the whole SiteIndexCache story. The more I think about it, the trickier caching looks: what happens if we compute the same multisite slice twice? Since we're storing only single-site results, do we read the cache or do a full recompute? It would need to be a full recompute. The cache is probably only useful for second-time single-site indexing, which is a rather restricted case in exchange for considerable code complexity.

Not saying we should not do it at some point if there is a compelling reason. But the reasons given here, including the mean-field implementation, seem to be solved more simply in #232. In particular, that proposal is just an Array-like wrapper, so it is orthogonal to the current indexing of GreenFunctions, which is good. In that sense, it is similar to, but cleaner than, the HybridSliceMatrix approach in this thread. It follows the same logic as repeated slicing of Arrays in Julia.

Hence, I'm closing this for the moment. We could reopen further down the road if we find a good reason for single-site caching.