pablosanjose / Quantica.jl

Simulation of quantum systems on a lattice
Other
68 stars 8 forks source link

Self-consistent mean fields #229

Closed pablosanjose closed 6 months ago

pablosanjose commented 8 months ago

An important next step in Quantica is to start introducing the ability to model electronic interactions. In this issue we discuss possible designs to implement them at a mean-field level.

Introduction

We consider an interaction of the form $$\hat V = \sum_{ij} \hat n_i V(r_i-r_j) \hat n_j$$ for some interaction potential V(r) and where $$\hat ni=\sum{\alpha\beta}c^\dagger{i\alpha}K{\alpha\beta}c_{i\beta}$$ is some single-particle operator acting on site i (here α and β are orbital indices). Usually this is simply the electronic charge, so K=I.

Note: we actually assume a normal-ordered version of the above interaction when deriving the Hartree-Fock expressions below.

The simplest approach to describe this interaction is (self-consistent) mean field approximation, i.e. Hartree-Fock. This corresponds to introducing a single-particle self-energy $$\Sigma^{HF} = \Sigma^H + \Sigma^F$$ that is ω-independent, and where $$\Sigma^H{ij} = K \delta{ij}\sum_kV(r_k-ri)\mathrm{Tr}[K\rho{kk}] $$ $$\Sigma^F_{ij} = -V(r_i-rj) K \rho{ij}K $$ Here the trace is taken over orbital degrees of freedom, and the density matrix is defined as $$\rho_{ij} = \langle c^\dagger_jci\rangle$$ (note the switched site indices, and the fact that `ρ{ij}itself is a matrix over orbital indices in sitesiandj`).

The key non-trivial aspect of the above is that the self energy which enters the Green function, and which is therefore used to compute ρ itself depends on ρ. This is the self-consistent aspect of the mean field. It is typically solved by iteration with an initial guess until convergence.

What form should a mean field take in Quantica?

Since the Hatree-Fock Σ is in fact a self-energy, one's first instinct could be to introduce it in the same way in which we introduce embedding self energies from leads or other models, i.e. using attach on an AbstractHamiltonian. That could actually work except for a crucial limitation of embedding self-energies: they act by definition on a finite set of sites in a given system. However, the Hatree-Fock Σ acts on any pairs of sites ij for which V(r_i - r_j) is finite. If the system is itself finite, that is not a problem: just use attach to every site, without contraints. However, if the system is unbounded we cannot do this.

The underlying reason that embedding self-energies are assumed bounded is that they need to be composable with one another, which is easy to do when all of them are bounded: just use a T-matrix to dress the Green's function g without self-energies, $$G=g+gTg$$ where $$T = (1-\Sigma g)^{-1}\Sigma$$ has support only on the finite collection of sites with a self energy Σ.

If all self-energies are unbounded, they could also be composed: just do $$G = (ω-H-\Sigma)^{-1}$$ where Σ is the total lattice-periodic self energy. This Σ could even be ω-dependent for some specific solvers (e.g. Schur), but with other solvers (KPM, band solvers...) Σ needs to be ω-independent. Luckily, this is the case of Hartree-Fock

So the general strategy seems clear: we should not mix the mean-field self-energy with the embedding self-energies. The former should be viewed as a special type of (ω-independent) term in the Hamiltonian, and should hence be incorporated as a parametric model that depends on a parameter ρ.

Note: for bounded systems we do have the option, if we so wish, to add a Hartree-Fock (even a more sophisticated ω-dependent interaction self-energy) using attach on all sites. The general approach proposed below can also be specialized to do this, using in this case an attach(parametric_model) syntax.

Mean field as a parametric model

So, imagine we define a ParametricHamiltonian h with a mean field like this

gs(h) = greenfunction(h, solver = some_solver)[region = some_region]
ρs(h) = density(gs(h); kBT = 0, µ = 0)
mean_field = @onsite((r; ρ) -> HartreeFock(ρ,V,K)(r)) + @hopping((r, dr; ρ) -> HartreeFock(ρ,V,K)(r, dr), range = max_Fock_range)
h = hamiltonian(some_lattice, some_model + mean_field)

Then, h would have a parameter ρ, and we could perhaps do h0 = h(; ρ = 0) to get the Hamiltonian without the mean field, use it to obtain ρ0 = ρs(h0), and then do h1 = h(; ρ = ρ0) and so on, until convergence.

There is a crucial problem with the above idea: HatreeFock(ρ,V,K) cannot be defined as a function purely of r and dr, because ρ is actually a matrix over sites in some_region, not position. So HartreeFock should be a function of site indices, a possibility that we explicitly avoided when designing TightbindingModels. So now we see that we need to extend support to allow site indexing in models. This need was also anticipated when considering how to approach #4, so doing this now could ease addressing Wannier90 interfaces.

The proposed syntax for indexing in models is the following

mean_field = @onsite([i; ρ] -> HartreeFock(ρ,V,K)[i]) + @hopping([i,j,dn; ρ) -> HartreeFock(ρ,V,K)[i,j,dn])

The syntax [i; kw...] -> ... is not valid Julia syntax, but it is parseable, so it can be transformed by the @onsite and @hopping macros into a new type of AbstractParametricTerm which we may call IndexedOnsite and IndexedHopping, which wrap an index function (i; kw...) -> ... or (i,j,dn; kw...) -> ... instead. The difference with a normal ParametricOnsite and ParametricHopping is that when calling applyterms on them, we will evaluate the wrapped function with indices instead of positions.

One potential problem with the above is that the Hartree part of the mean_field model actually contains a sum over unit cells that should probably be precomputed for performance. This is perhaps not immediately obvious from the Σ^H definition. One should just realize that $$\mathrm{Tr}[K\rho{kk}]$$ is periodic over the lattice, i.e. it is the same inside each unit cell, so one can restrict the sum over k to sites within a single unit cell, as long as we replace V(rᵢ - rₖ) with $$W{ik} = \sum_n V(r_i - r_k - R_n)$$ where n is summed over all unit cells. But this is an optional optimization, that can be solved simply by adding another method like HatreeFock(ρ,V,W,K), and W could be computed with some function sum_cells(V, some_lattice) that returns perhaps a Matrix over the unit cell of some_lattice.

Another issue is what kind of object is ρ, obtained with the function density (which doesn't exist in Quantica yet). It cannot simply be a matrix over a LatticeSlice (which is what we currently get when we slice a GreenSolution over some sites for example), because it needs to know what is the actual LatticeSlice to be able to return the correct value (or throw a BoundsError) when indexed with ρ[i] or ρ[i,j,dn], since i, j are actual site indices in the unit cell (perhaps included in the LatticeSlice or perhaps not). So we need to define a new type of object, that wraps a matrix and a LatticeSlice, and probably some extra BlockStructure object to allow indexing multiorbital densities.

In fact, we should probably wrap also the output of slicing a GreenSolution in that new type, so that we can make better use of the result (which we currently cannot). Let's call this new type Observable. It would represents in general things of the form <P_s A P_s>, where P_s is a projector over some finite set of sites, and A is any Operator (bounded or unbounded, see #227). In particular we would have ρ::Observable and gs(ω)::Observable (the latter would be a breaking change, by the way). And that Observable type should implement indexing over sites and unit cells, returning views of the wrapped matrix, or throw a BoundsError (it should therefore also implement haskey or similar).

The details of the Observable implementation I leave for another post.

CC: @BacAmorim

pablosanjose commented 8 months ago

Small correction: since the mean field terms need both the site indices and site positions, we need something a bit more general, like

mean_field = @onsite([site; ρ] -> HartreeFock(ρ,V,K)[site]) + @hopping([site´, site; ρ) -> HartreeFock(ρ,V,K)[site´, site])

where site is a non-allocating struct that wraps the site index, position, sublattice and cell

BacAmorim commented 8 months ago

Thank you for the nice write up. Some thoughts I have on the proposal:

  1. I agree that an interacting self-energy (bulk self-energy) should be treated as a different type of object than a self-energy due to leads (embedding self-energies).
  2. We should leave room in the design to allow for frequency dependent bulk self-energies (as they appear in GW approximation). Should the frequency dependent bulk self-energy be treated on the same footing as the frequency independent case? Since the methods used to treat frequency-dependent self-energies (which are functions of the Green's function) are very different for the frequency independent case (which are functions of the density matrix), I think they should be treated seperatly.
  3. What kind of structure should a mean-field Hamiltonian be? While I see why treating it as a ParametricHamiltonian is appealling, I fear that in doing so one has to use a language that might be a bit weird. Case in point: I don't understand why the Fock mean-field should be a function of (r, dr). It has the same support of the density matrix, ρ[i, j] and can be computed without ever having to mention r or dr.
  4. In the same way that OpenHamiltonian exists, would it make sense to have MeanFieldHamiltonian as a type? Then we could have solver for this type of problem, that specify both: how (i) the mean-field is compute in each self-consistent step, (ii) how the self-consistent is uptated (mixing) between iterations. A MeanFieldHamiltonain would be a parent Hamiltonian + frequency independent bulk self-energies.
  5. What is the need of defining a general Observable object instead of defining only a DensityMatrix object? Observables can be obtained from the traces of DensityMatrices/GreensFunctions multiplied with Operators. What would be the use case for an Observable that cannot be covered by Hamiltonian, Operator, GF or DensityMatrix?
  6. In the example code, why is the density matrix computed from a Green's function? This seems like an unnecessary middle step.
  7. I think is is essencial to have an easy way to index Observables/Operators. This should also be well documented as it is essencial if a user wants to extend any of the functionality of Quantica (should we open an issue for this?).
  8. site is a non-allocating struct that wraps the site index, position, sublattice and cell Is this a LatticeSlice or is it something else?

pablosanjose commented 8 months ago

We should leave room in the design to allow for frequency dependent bulk self-energies (as they appear in GW approximation). Should the frequency dependent bulk self-energy be treated on the same footing as the frequency independent case? Since the methods used to treat frequency-dependent self-energies (which are functions of the Green's function) are very different for the frequency independent case (which are functions of the density matrix), I think they should be treated seperatly.

This is the centerpiece of this discussion: should we create a new structure that represents a totally general self-energy or should we treat the specific case of a Hartree-Fock mean field self-energy differently?

One would think it is advisable to generalize things as much as possible. However, there are a number of distinct cases that can be difficult to merge into a single representation, I think

  1. Embedding self-energies Σ(ω) from finite-size contacts, with support on a certain LatticeSlice (i.e. on a finite number of sites)
  2. Embedding self-energies Σ(ω) from infinite contacts (e.g. from a superconducting shell on an infinite nanowire), whose support cannot be represented by a LatticeSlice, but which is lattice-periodic like the Hamiltonian (for certain choice of unit cell)
  3. Like 2, but without any lattice periodicity
  4. Hartree-Fock self-energy with a finite-range cutoff (for the Fock term) - this is analogous to 2, but without an ω dependence.
  5. Like 4 but without a finite-range cutoff. Unless V(r) is itself finite-range, this case probably only makes sense for finite-size (bounded) systems
  6. More general Many-Body Σ(ω) acting everywhere, and with a more convoluted self-consistency condition than Hartree-Fock

So here we are interested in case 4. Currently we have support for case 1 via attach. As I mentioned, the bounded Σ(ω) of case 1 is amenable to a T-matrix solver, but the rest are not. So that already poses a challenge to a generalization. For unbounded self-energies we need to use specific Green solvers. We currently have Schur (for 1D), SparseLU (for 0D), KPM (for 0D) and Bands (for arbitrary D>0). A second problem becomes apparent when we consider the constraints of ω-dependent versus ω-independent self energies. Some Green solvers can deal with arbitrary ω dependence (such as Schur and SparseLU), while others (KPM and Bands) cannot. Hence, for case 4 of interest, we have all our solvers at our disposal to compute Green functions, while for case 6, we would only be able to deal with 0D and 1D systems.

I could go on, but it seems to me that trying too hard to unify all kinds of self-energies is probably not a good strategy. For the particular case of case 4, a ParametricHamiltonian approach as proposed here allows us to tap into all the GreenSolvers we may have, as long as the self-energy is made finite-range (i.e. if we stick to 4, but leave out 5).

The design space of a more general ManyBodyHamiltonian is a bit daunting for me at the moment. That would need to be thought out carefully and discussed. Perhaps we could arrive at the conclusion that a general unbounded many-body Σ(ω) just makes sense at the level of an equation-of-motion for the GreenFunction, so we don't even need to introduce the concept of unbounded Σ(ω). But I am not at all sure about this, and would very much appreciate your thoughts about my reasoning above. Do you still think we could have a completely general SelfEnergy construct?

What kind of structure should a mean-field Hamiltonian be? While I see why treating it as a ParametricHamiltonian is appealling, I fear that in doing so one has to use a language that might be a bit weird. Case in point: I don't understand why the Fock mean-field should be a function of (r, dr). It has the same support of the density matrix, ρ[i, j] and can be computed without ever having to mention r or dr.

If we think generally, we need to consider unbounded systems. Then Hartree-Fock probably only makes sense with a range cutoff, i.e. case 4, but not 5. If we have a Coulomb interaction without any cutoff, the Fock term in a metal would yield hoppings of infinite range. I don't know how to deal with that. So I actually constrain the Hartree-Fock problem to a Hartree-Fock-with-cutoff in my head. The cutoff, moreover, determines the degree of sparsity of the mean field Hamiltonian, so it is critical to choose a balance between performance and accuracy. For simple models, such as Hubbard, the cutoff is extreme, and it needn't even appear (we just have an onsite mean field term), but more generally, it must be provided.

In the same way that OpenHamiltonian exists, would it make sense to have MeanFieldHamiltonian as a type? Then we could have solver for this type of problem, that specify both: how (i) the mean-field is compute in each self-consistent step, (ii) how the self-consistent is uptated (mixing) between iterations. A MeanFieldHamiltonain would be a parent Hamiltonian + frequency independent bulk self-energies.

My vision here is to be able to mix and match as much functionality as we can. The current GreenSolvers are pretty powerful. The self-consistency step and the GreenSolver step (to compute the density matrix) are independent, so it makes sense to decouple them. Our GreenSolvers should be able to work with MeanFieldHamiltonian, however we define it (ideally, without rewriting them all again). That is precisely my rationale to make MeanFieldHamiltonian === ParametricHamiltonian. What requirements does MeanFieldHamiltonian have that are not met by ParametricHamiltonian, with ρ as a parameter?

The only tricky thing in the proposal is how to define the parametric terms and ρ itself to make ParametricHamiltonian be able to implement a Hartree-Fock self-energy.

What is the need of defining a general Observable object instead of defining only a DensityMatrix object? Observables can be obtained from the traces of DensityMatrices/GreensFunctions multiplied with Operators. What would be the use case for an Observable that cannot be covered by Hamiltonian, Operator, GF or DensityMatrix?

Maybe none. In fact, GreenFunction represents the retarded and advanced Keldysh components (depending on imag(ω)), while the density matrix is the lesser/greater component. And that exhausts everything. Even in a future real-time extension of Quantica.jl, we should probably build everything on top of the full Keldysh Green function, so Observable would not be needed.

What I was really trying to introduce with Observable is some kind of wrapper for the output of GreenFunction or density or current or any other observable computed over a finite LatticeSlice (which currently takes the form of a flat Matrix), bundled together with some metadata to enable indexing over a any given cell or site (information that is currently lost when calling, say, g(ω)[region = some_region]). So perhaps what I am really talking about is not an Observable, but rather an Indexable wrapper, or similar. Some x::Indesable that can then be used as x[cellvector, siteindex´, siteindex] -> view over wrapped flat matrix. Still thinking about that.

In the example code, why is the density matrix computed from a Green's function? This seems like an unnecessary middle step.

In equilibrium we can compute the density matrix as an ω integral of the spectral function, using the fluctuation-dissipation theorem. This is a fallback method that works for all GreenFunctions, regardless of the solver used. By integrating over the analytical continuation in the complex plane, the integral is rather efficient (we do that for the critical current already, using QuadGK). However, certain solvers could also implement other specialized methods, so that if we do density(gs::GreenSlice) for a gs with GreenSolvers.KPM, we could use the already computed momenta to get the density more efficiently. That would be an optimization.

I think is is essencial to have an easy way to index Observables/Operators. This should also be well documented as it is essencial if a user wants to extend any of the functionality of Quantica (should we open an issue for this?).

I fully agree. When designing a roadmap to mean fields in my head, the first step is to refactor the different BlockStructure objects so that we can build an Indexable or Observable, or whatever we call it, with fast indexing functionality. And only then, see how to make the ρ density matrix support that interface and use it as a mean field parameter.

Feel free to open a separate issue for that to discuss the implementation of Indexable. Otherwise we can do that here.

site is a non-allocating struct that wraps the site index, position, sublattice and cell Is this a LatticeSlice or is it something else?

No, it would be a new, very simple type, that the user would not need to worry about:

struct Site{T,E,L}
    r::SVector{E,T}     # position
    i::Int              # site index
    n::SVector{L,Int}   # cell
    s::Int              # sublat
end

The user would only interact with Site when defining IndexedOnsiteTerms or IndexedHoppingTerms, such as @hopping([s´, s; ρ, K = I] -> -K * ρ[s´.i, s.i] * K * V(s´.r - s.r)) for the Fock term.

BacAmorim commented 8 months ago

In order of relevance:

What I was really trying to introduce with Observable is some kind of wrapper for the output of GreenFunction or density or current or any other observable computed over a finite LatticeSlice (which currently takes the form of a flat Matrix), bundled together with some metadata to enable indexing over a any given cell or site (information that is currently lost when calling, say, g(ω)[region = some_region]). So perhaps what I am really talking about is not an Observable, but rather an Indexable wrapper, or similar. Some x::Indesable that can then be used as x[cellvector, siteindex´, siteindex] -> view over wrapped flat matrix. Still thinking about that.

I completelly agree with this. I think that the most important thing is to be able to use the indexing machenery of Hamiltonian with general matrices. If we have a general matrix A (or a series of harmonic matrices An representing intracell and intercell matrix elements) with dimensions compatible with the Quantica Hamiltonian, h, one should be able to do something like:

Aop = operator(h, A)
Aop = operator(h, (d0, A0), (d1, A1)...)

Such that the Aop can be indexed in the same way a Quantica Hamiltonian is index. If this can be made to work with a Lattice instead of an Hamiltonian, then this approach would also solve the problem of convert a Hamiltonian from Wannier90 calculation to Quantica.

I fully agree. When designing a roadmap to mean fields in my head, the first step is to refactor the different BlockStructure objects so that we can build an Indexable or Observable, or whatever we call it, with fast indexing functionality. And only then, see how to make the ρ density matrix support that interface and use it as a mean field parameter.

Completelly agree! I think this is really important to enable extensions of Quantica by users. And once we have this the mean-field part becomes trivial.

  1. I agree with you that for now we should focus on mean-field / frequency independent bulk self-energies has this is a correction to a single particle Hamiltonian. For the frequency dependent part one has to deal with Green's functions, Therefore, there is no benefit to treat them on a equal footing.

  2. Regarding unbounded operators / self-energies. I assume that by unbounded you mean translationally invariant infinite systems. For these you can use Bloch's theorem, where you don't need a cutoff in real-space (you need a cutoff in momentum space). For example: for the density matrix of a translationally invariant system, it is more natural to compute ρ(k) where k is a Bloch momentum. Naturally you can Fourier transform it to real space to obtain ρ(R) where R is a lattice vector (this is what Rui does). ρ(R) has the same structure as a Quantica Hamiltonian: ρ(R) for each R is a harmonic of the density matrix. The issue is that for a metal at T=0 ρ(R) is not short ranged. On the other hand ρ(k) has the same structure of a Bloch Hamiltonian. In some cases it might make sense to work with ρ(k), while in others it makes sense to work with ρ(k).

  3. The previous points leads to the question: does it make sense to include self-consistent calculations as part of the Quantica core? In some contexts one wants to take advantage of Bloch's theorem while in other cases Bloch's theorem might actually complicate things. Provided Quantica provides efficient indexing of general operators, then the user can easilly use that for its own applciation / subpackage. Do you really want to have self-consistent calculations, time-dependent mean field, keldysh, etc all backed in into Quantica? Or should Quantica be a base for a broader ecosystem built on top of it (similar to SciML)?

In equilibrium we can compute the density matrix as an ω integral of the spectral function, using the fluctuation-dissipation theorem. This is a fallback method that works for all GreenFunctions, regardless of the solver used. By integrating over the analytical continuation in the complex plane, the integral is rather efficient (we do that for the critical current already, using QuadGK).

I know that. But doing so is not very efficient. In that way what you are doing is first computing the spectral function (imag part of GF) and then integrating it over omega. The best way of computing the integral over chebyshev polynomials is via a FFT, but that is equivalent to just compute the Chebyshev series of the Fermi function using ApproxFun.jl of FastTransforms.jl (which use FFTW). So the Green's function step is unnecessary.

pablosanjose commented 8 months ago

I think that the most important thing is to be able to use the indexing machenery of Hamiltonian with general matrices.

Yes, let's focus first on this. I'll open a separate issue about it.

EDIT: see #230

If this can be made to work with a Lattice instead of an Hamiltonian, then this approach would also solve the problem of convert a Hamiltonian from Wannier90 calculation to Quantica.

Extending operator to work just as hamiltonian but yielding an Operator is trivial, and we should probably do it, so that op = operator(lat, model, orbitals = 2) works. Is this what you have in mind?

Regarding unbounded operators / self-energies... For these you can use Bloch's theorem, where you don't need a cutoff in real-space (you need a cutoff in momentum space)

I think we both understand each other, probably. What I mean here is that in an unbounded, translationally invariant system, you have Bloch's theorem, yes. But you can still have hoppings that are infinite-range (e.g. Fock term in a metal with Coulomb potential, without a cutoff), or finite range (e.g. Fock term in an extended Hubbard model, which is a nearest neighbor mean-field hopping). In the former case, ρ(k) has an unbounded number of harmonics, because ρ(R) has in principle no cutoff. So you need to fix one.

... we should focus on mean-field / frequency independent bulk self-energies has this is a correction to a single particle Hamiltonian.

Yes, I agree. This functionality at least should be part of base Quantica.

The previous points leads to the question: does it make sense to include self-consistent calculations as part of the Quantica core?

This is a very valid concern. How far should we extend Quantica, and when should we just write a new downstream package? I believe the current feature-set of Quantica should more or less be its full scope (modulo adding new solvers, fixing bugs, or polishing/enhancing functionality that is currently present). In other words, in principle I'm considering here just those features that can be squeezed out from the current design with only minor generalizations. Hence, I'm hesitant to introduce new structures representing general many-body self-energies, but I'm more favorable to self-energies that can fit into the current ParametricHamiltonian paradigm.

The thing is that once the proposed representation of ρ is in place, iteration till self-consistency is such a minor addition that it might make sense to add it. It would probably be a very simple fixedpoint function that works just like foldl but without a pre-fixed number of iterations, but rather a convergence criterion.

Things beyond this, such as a real-time Quantica integrator, is probably better as a subpackage.

I know that. But doing so is not very efficient. In that way what you are doing is first computing the spectral function (imag part of GF) and then integrating it over omega. The best way of computing the integral over chebyshev polynomials is via a FFT, but that is equivalent to just compute the Chebyshev series of the Fermi function using ApproxFun.jl of FastTransforms.jl (which use FFTW). So the Green's function step is unnecessary.

Absolutely. I didn't mean to have ρ computed always as a numerical integral, only as a general fallback method that works for any GreenFunction (any solver). Of course, if your solver of choice is KPM, density should compute it as you say, which is much faster.