JuliaMolSim / DFTK.jl

Density-functional toolkit
https://docs.dftk.org
MIT License

LDOS mixing for non-monotonic smearings #787

Closed mfherbst closed 1 year ago

mfherbst commented 1 year ago

There is an issue with the LDOS mixing for non-monotonic smearing functions, e.g.

using DFTK

a = 5.42352  # Bohr
lattice = a / 2 * [[-1  1  1];
                   [ 1 -1  1];
                   [ 1  1 -1]]
atoms     = [ElementPsp(:Fe, psp=load_psp("hgh/lda/Fe-q8.hgh"))]
positions = [zeros(3)];

kgrid = [5, 5, 5]
Ecut = 10
magnetic_moments = [4];

model = model_LDA(lattice, atoms, positions;
                  magnetic_moments,
                  temperature=5e-4,
                  smearing=Smearing.MarzariVanderbilt())
@show model.n_electrons

basis = PlaneWaveBasis(model; Ecut, kgrid)
ρ0 = guess_density(basis, magnetic_moments)
scfres = self_consistent_field(basis;
                               ρ=ρ0, mixing=LdosMixing(),
                               is_converged=DFTK.ScfConvergenceDensity(1e-8))

Also for Methfessel-Paxton. Might be due to the temperature fiddling we do in the mixing, which does not play out so well here. Replacing the smearing in the LDOS by Gaussian gives a performance penalty over Kerker, but at least the SCF converges.

antoine-levitt commented 1 year ago

Is it that the LDOS gets negative? Could we just filter out the negative values?

mfherbst commented 1 year ago

Can't be. MarzariVanderbilt does not get negative. Ah but the derivative can ... that's a good point. I'll check that.

mfherbst commented 1 year ago

It does not get negative, but the derivative can have pretty large values. My current suspicion is that this screws things up, in particular when the orbitals are not yet super accurate. In that case increasing the temperature should help, though.

In any case, one "solution" seems to be to use Gaussian smearing for the LDOS in this case and ignore the actual smearing chosen by the user. Maybe we should just do that.
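
A quick way to see the derivative behaviour (my sketch, assuming the Smearing.occupation helper DFTK exports; ForwardDiff supplies the derivative):

using DFTK, ForwardDiff

# The occupation derivative enters the LDOS weights. For Gaussian smearing it
# keeps a single sign, while the Marzari-Vanderbilt occupation is
# non-monotonic, so its derivative changes sign and can take large values.
f_mv(x)    = Smearing.occupation(Smearing.MarzariVanderbilt(), x)
f_gauss(x) = Smearing.occupation(Smearing.Gaussian(), x)
for x in range(-3, 3; length=13)
    println(round(x; digits=2), "   ",
            ForwardDiff.derivative(f_mv, x), "   ",
            ForwardDiff.derivative(f_gauss, x))
end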

azadoks commented 1 year ago

As a side note on MV smearing, I was just doing Fermi-level post-processing on the MC3D database (~30k MV-smeared QE calculations), and I ran into a bunch of cases where the DFTK bisection Fermi solver gives a bad Fermi level; here's an example:

[image: example of the bad Fermi level found by the bisection solver with MV smearing]

I did a quick and dirty implementation of the QE method: trying Newton's method starting from the bisection Fermi level found using Gaussian smearing. This fixes up most of the problem cases, but it's not super satisfactory.

I'm not sure how much the Fermi level plays into the LDOS mixing, but I thought I'd mention it in any case now that it's on my mind.
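
For concreteness, here is a standalone toy version of that two-step idea (my sketch, not azadoks' or QE's actual code; made-up band energies, with Roots.jl and ForwardDiff for the solvers):

using DFTK, Roots, ForwardDiff

T  = 0.02                                    # smearing temperature
εs = [-0.10, -0.02, 0.00, 0.01, 0.03, 0.08]  # made-up band energies
n_electrons = 3
excess(εF, s) = sum(Smearing.occupation(s, (ε - εF) / T) for ε in εs) - n_electrons

# Step 1: bisection on the monotonic Gaussian counting function
εF0 = find_zero(εF -> excess(εF, Smearing.Gaussian()),
                (minimum(εs) - 1, maximum(εs) + 1), Bisection())

# Step 2: Newton refinement with the actual (non-monotonic) MV smearing
mv(εF)  = excess(εF, Smearing.MarzariVanderbilt())
dmv(εF) = ForwardDiff.derivative(mv, εF)
εF = find_zero((mv, dmv), εF0, Roots.Newton())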

mfherbst commented 1 year ago

I did a quick and dirty implementation of the QE method: trying Newton's method starting from the bisection Fermi level found using Gaussian smearing. This fixes up most of the problem cases, but it's not super satisfactory.

I'm not fully convinced by that, because the occupation derivative can become quite large. I have a student looking into this more systematically right now ... first trying to find the multiple Fermi levels and then addressing the question of which one to pick.

mfherbst commented 1 year ago

I ran into a bunch of cases where the DFTK bisection Fermi solver gives a bad Fermi level; here's an example:

Yes, we found those too. We really should do something about this.

antoine-levitt commented 1 year ago

That's super interesting! Technically I like the DFTK zero more than the QE one, because the function is linear at the DFTK zero but appears quadratic at the QE one. This is better because it means the root won't be destroyed by small perturbations. I wonder in which sense the QE zero is better...
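
(A gloss on the robustness point, a standard perturbation argument rather than anything from the thread: if $f(\varepsilon^*) = 0$ with $f'(\varepsilon^*) \neq 0$, shifting $f$ by a small $\delta$ moves the root by roughly $-\delta / f'(\varepsilon^*) = O(\delta)$; at a quadratic zero, where $f'(\varepsilon^*) = 0$, the root moves like $O(\sqrt{\delta})$ and can even disappear when the perturbation lifts the curve off the axis.)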

azadoks commented 1 year ago

I wonder in which sense the QE zero is better...

In the sense that it's in better agreement with Gaussian smearing / it's actually the physical zero. With Gaussian smearing (same material, temperature):

[image: the same material with Gaussian smearing, giving the expected Fermi level]

Particularly important because the choice of zero changes the electronic character; here are the bands with the DFTK MV Fermi level:

[image: band structure plotted with the DFTK MV Fermi level]

antoine-levitt commented 1 year ago

Thanks for the very detailed plots, super helpful!

OK so this is nontrivial: in a semiconductor (as here) we want the "entropic" zero, i.e. the one where n(ε) is flat and constant, whereas in a metal we would probably prefer one where the density of states is large? Not sure how to encode this, though. In particular, the only 100% robust method (bisection) requires bracketing, which is hard to achieve here...

Maybe we can try the "semiconductor" filling first (a constant number of states per k-point), and if that's adequate we just use it, even at finite temperature? CC @gkemlin, who added this for the zero-temperature case.

gkemlin commented 1 year ago

Well, it's actually strange to find this Fermi level for a semiconductor... Are we sure about the uniqueness of the zero of the function we pass to the bisection with MV smearing?

What I implemented for zero temperature was just to take the mid-point between HOMO and LUMO and use the bisection only for finite temperature.
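
For illustration, a standalone sketch of that zero-temperature rule (hypothetical helper, not DFTK's actual implementation): fill an integer number of bands at every k-point and place the Fermi level mid-gap, bailing out in the metallic case.

# eigenvalues: one sorted vector of band energies per k-point
function fermi_midgap(eigenvalues, n_occ::Int)
    homo = maximum(εk[n_occ]     for εk in eigenvalues)
    lumo = minimum(εk[n_occ + 1] for εk in eigenvalues)
    homo < lumo || error("No gap: metallic case, fall back to bisection")
    (homo + lumo) / 2
end

# Made-up eigenvalues at two k-points, 2 occupied bands:
fermi_midgap([[-0.3, -0.1, 0.4], [-0.25, -0.05, 0.5]], 2)  # ≈ 0.175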

mfherbst commented 1 year ago

it's actually the physical zero.

But how do you even judge that? I mean, in the insulator case you sketch I agree with you. But as Antoine said, I fail to see an easy way to encode this programmatically, in particular for metals. I guess the DOS, or rather the occupation derivative, should be a pretty good indicator ... somehow.

Also I'm not sure about a simple Newton iteration. For MV this probably works (because it's asymmetric about the Gaussian), but for MP-type smearings: can't it happen that you go the wrong way, especially in metals?

In particular, the only 100% robust method (bisection) requires bracketing, which is hard to achieve here...

Actually it's not all that bad. My student and I are experimenting with IntervalRootFinding and Roots.find_zeros at the moment (similarly starting from a guess based on Gaussian smearing), and I think we should be able to find all the roots around a guess Fermi level reasonably rigorously. Then the remaining question is of course which one to pick ...

Are we sure about the uniqueness of the zero of the function we pass to the bisection with MV smearing ?

No, it's not unique for MV at non-zero temperature, that's exactly the problem :smile:.
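
To illustrate the non-uniqueness, a hedged toy along the lines of that experiment (same made-up data as the sketch above; Roots.find_zeros scans a window around a Gaussian-based guess):

using DFTK, Roots

T  = 0.02
εs = [-0.10, -0.02, 0.00, 0.01, 0.03, 0.08]
excess(εF) = sum(Smearing.occupation(Smearing.MarzariVanderbilt(), (ε - εF) / T)
                 for ε in εs) - 3

εF_guess   = 0.005   # e.g. from a Gaussian bisection as above
candidates = find_zeros(excess, εF_guess - 10T, εF_guess + 10T)
# With MV smearing `candidates` may contain several roots; which one to pick
# is exactly the open question here.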

mfherbst commented 1 year ago

Here is an overview of one iron case we found (only 4 k-points on either side, so arguably not the most realistic):

[image: overview of the roots for the iron test case]

Dotted and dashed lines are the roots. The dashed roots are the ones found by Gaussian bisection followed by Newton on the actual smearing. So Newton does not necessarily pick the closest in all cases and sometimes actually even fails to converge. Also the derivatives at the roots can be all over the place, it seems.

azadoks commented 1 year ago

But how do you even judge that?

I agree that this is in general a very tough question to answer, and I don't know what would be best or even good to do here.

I mean in the insulator case you sketch I agree with you.

Just to be clear, this isn't a sketch; it's a real material that I found in the MC3D (so ideally, it should be an experimentally known material).

So Newton does not necessarily pick the closest in all cases and sometimes actually even fails to converge.

This I can back up with some experience on the big database of more realistic calculations. At least with my implementation (which I'm not overly confident in), "sometimes ... fails" is probably a bit of an understatement; it fails quite often (I use sqrt(eps(T)) as the tolerance for Roots.find_zeros).

antoine-levitt commented 1 year ago

Sooo... don't use smearing at all? (at least non-monotonic smearing) We should implement some kind of interpolation method (à la tetrahedron) at some point anyway.

azadoks commented 1 year ago

Sooo... don't use smearing at all?

I'm not sure this is a very practical solution in general. MV smearing, as I understand it, gives much better zero-temperature-like forces (hence the name "cold") even at not-insignificant smearing temperatures. This is good for efficient structure optimization of metals, and particularly important in the context of high-throughput or strongly out-of-equilibrium calculations where the electronic character of the material isn't known.

(at least non-monotonic smearing)

In one case that we found in the common workflows verification work, the monotonic smearing methods all gave two lattice parameter minima (approx. a factor of 2 apart) for an (exotic) oxide, which really messed up the equation of state. The only tested smearing method that didn't give two minima (at any smearing temperature) was MV. Despite its quirks, MV smearing does still solve non-trivial shortcomings of the monotonic methods.

We should implement some kind of interpolation method (ala tetrahedron) at some point anyway.

In any case this would be very useful, especially when the material is known to be insulating. However, I don't think it can replace smearing methods in general.

antoine-levitt commented 1 year ago

In one case that we found in the common workflows verification work, the monotonic smearing methods all gave two lattice parameter minima (approx. a factor of 2 apart) for an (exotic) oxide, which really messed up the equation of state. The only tested smearing method that didn't give two minima (at any smearing temperature) was MV. Despite its quirks, MV smearing does still solve non-trivial shortcomings of the monotonic methods.

Isn't this just due to too high temperature / insufficient kpoint sampling?

In any case this would be very useful, especially when the material is known to be insulating. However, I don't think it can replace smearing methods in general.

There's nothing to do for insulating materials, as the non-interpolated, non-smeared method is already pretty efficient. There are conflicting reports in the literature on what's better (tetrahedron method, various smearing schemes), and it's all very confusing, but it seems to me that the tetrahedron scheme should be more robust and easier to work with (only one parameter to tune, the k-point sampling, instead of the temperature)?

azadoks commented 1 year ago

Isn't this just due to too high temperature / insufficient kpoint sampling?

In a sense, yes: when the sampling is made denser and the temperature is decreased, they all agree. More practically, the sampling was already much denser than what's normally done (~0.07 Å^-1) when the problem was found. I'd have to check with a colleague, but if I remember right, the problem only went away for monotonic smearing at really dense k-point sampling (~0.04 Å^-1, I'd guess).

it seems to me that the tetrahedron scheme should be more robust and easier to work with

Yeah, I agree: it's probably both of these. The tradeoff, though, is that for the same k-point density, tetrahedron integration is already much more expensive than smearing, to say nothing of the density required to achieve the same convergence on, e.g., the forces.

antoine-levitt commented 1 year ago

for the same k-point density, tetrahedron integration is already much more expensive than smearing

Really? Why? Surely the additional cost of performing the tetrahedron interpolation is negligible?

azadoks commented 1 year ago

I'm not entirely sure, but in my experience with DOS post-processing using QE's dos.x, the tetrahedron method took at least an order of magnitude more time than smearing.

antoine-levitt commented 1 year ago

Ah, for the DOS; I thought you meant for the ground state. Yeah, for the DOS I can see it taking much more time.

antoine-levitt commented 1 year ago

https://arxiv.org/pdf/2212.07988.pdf (haven't actually read it)

mfherbst commented 1 year ago

The source of the bug is the way we determine the "unoccupied" orbitals. It's fixed in #792.