dipc-cc / hubbard

Python tools for mean-field Hubbard models
https://dipc-cc.github.io/hubbard/
GNU Lesser General Public License v3.0

mnt: avoid using energy loops to obtain the Density matrix #105

Open sofiasanz opened 3 years ago

sofiasanz commented 3 years ago

In relation to #101: with these changes we now avoid using energy loops to obtain the spin densities in the NEGF class; instead, the Green's function is a numpy.ndarray with shape (len(energy_grid), no, no).
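For reference, a minimal sketch of what the loop-free evaluation could look like, assuming dense numpy arrays for the Hamiltonian, overlap and a precomputed self-energy (all names and sizes below are illustrative, not the actual NEGF class API):

import numpy as np

# Illustrative sizes and arrays only, not the actual NEGF class data
no, nE = 200, 100
H = np.random.random((no, no))               # Hamiltonian
S = np.eye(no)                               # overlap matrix
SE = np.zeros((nE, no, no), dtype=complex)   # self-energy per energy point

E = np.linspace(-2.0, 2.0, nE) + 1j * 1e-3   # energy grid with small broadening

# Build (E*S - H - Sigma) for all energies and invert in a single batched call;
# the result has shape (len(energy_grid), no, no) as described above.
G = np.linalg.inv(E[:, None, None] * S[None] - H[None] - SE)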

zerothi commented 3 years ago

Just a small comment.

Be careful about this: if you end up with 100 energy points and a fairly large matrix, you could quickly reach a very high memory requirement, so the possibility of splitting up the energy points might be necessary?

Especially now since you have both equilibrium + non-equilibrium Green's function + the self-energies...

EDIT: Ok, only equilibrium + self-energies or non-equilibrium + self-energies, even though the problem might still persist.
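As a rough, purely illustrative estimate of the scale (the sizes are hypothetical, not from this PR):

nE, no = 100, 4000                 # hypothetical energy points and orbitals
mem_gb = nE * no * no * 16 / 1e9   # complex128 is 16 bytes per element
print(f"Green's function alone: ~{mem_gb:.1f} GB")   # ~25.6 GB, before self-energies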

sofiasanz commented 3 years ago

Hi Nick, thank you for the comment, as always!

Be careful about this: if you end up with 100 energy points and a fairly large matrix, you could quickly reach a very high memory requirement, so the possibility of splitting up the energy points might be necessary?

I see. I opened this pull request so that we could implement the BTD code proposed in #101. The thing is that we need to do the inversion in one go so we can avoid converting between sparse, block-matrix and numpy formats for each energy point (quoting https://github.com/dipc-cc/hubbard/issues/101#issue-947058764).

On the other hand, the BTD implementation (as far as I understood) is not always the fastest route. I guess we could support two possibilities: if BTD is the preferred route, use this implementation; if, on the other hand, brute-force inversion of the matrix is preferred, keep it as before?

zerothi commented 3 years ago

Hi Nick, thank you for the comment, as always!

Be careful about this: if you end up with 100 energy points and a fairly large matrix, you could quickly reach a very high memory requirement, so the possibility of splitting up the energy points might be necessary?

I see. I opened this pull request so that we could implement the BTD code proposed in #101. The thing is that we need to do the inversion in one go so we can avoid converting between sparse, block-matrix and numpy formats for each energy point (quoting #101 (comment)).

On the other hand, the BTD implementation (as far as I understood) is not always the fastest route. I guess we could support two possibilities: if BTD is the preferred route, use this implementation; if, on the other hand, brute-force inversion of the matrix is preferred, keep it as before?

Perhaps an easier approach would be to do the BTD in "batches"; the batch size could then be given as a RAM size (GB) or as a number of energy points, as you prefer. However, I think it would be nice if users could get feedback on how much memory a production run would consume? (Just an idea, since this can explode easily.)
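A sketch of what such batching could look like, accumulating only a diagonal quantity per batch so the full Green's function never has to be stored at once; the function name, arguments and RAM-budget logic are hypothetical, not part of the package:

import numpy as np

def ldos_in_batches(H, S, SE, E, weights, max_gb=2.0):
    # Invert (E*S - H - Sigma) batch by batch and accumulate -Im(G_ii)/pi,
    # choosing the number of energy points per batch from a rough RAM budget.
    no = H.shape[0]
    per_point_gb = no * no * 16 / 1e9            # one complex128 matrix
    batch = max(1, int(max_gb / per_point_gb))   # energy points per batch
    ldos = np.zeros(no)
    for i0 in range(0, len(E), batch):
        sl = slice(i0, i0 + batch)
        G = np.linalg.inv(E[sl, None, None] * S[None] - H[None] - SE[sl])
        ldos += np.einsum('e,eii->i', weights[sl], -G.imag) / np.pi
    return ldos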

I think the BTD would be "slower" for small systems due to the overhead in Python. However, that should be quite small (Aleksander could fill in here how much it is ;))

AleksBL commented 2 years ago

I might be wrong, but it seems the only place the full matrix of the Green's function is used is in the calculation of the spectral function? And that only seems to be calculated at E_F. We could just make the program return only the diagonal of the Green's function when we only need the DOS, and save a lot of RAM on all the unused off-diagonal entries? (The code already has a function for this.)
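For what it's worth, keeping only the diagonal changes the stored array from shape (nE, no, no) to (nE, no); a short illustration (the array names are hypothetical):

Gdiag = np.diagonal(G, axis1=1, axis2=2)   # shape (nE, no) instead of (nE, no, no)
DOS = -Gdiag.imag / np.pi                  # local density of states from the diagonal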

sofiasanz commented 2 years ago

I might be wrong, but it seems the only place the full matrix of the Green's function is used is in the calculation of the spectral function? And that only seems to be calculated at E_F. We could just make the program return only the diagonal of the Green's function when we only need the DOS, and save a lot of RAM on all the unused off-diagonal entries? (The code already has a function for this.)

Yes! Sorry for the delay in my answer. I have just changed this PR to use only the diagonal of the Green's function (also following your notation a bit). The only place where we need the full Green's function is to calculate the non-equilibrium correction integrals.

AleksBL commented 2 years ago

I see only the diagonal of the spectral function is used... I think we can get away with only calculating the extremal columns and rows of the Green's function?

sofiasanz commented 2 years ago

I see only the diagonal of the spectral function is used... I think we can get away with only calculating the extremal columns and rows of the Green's function?

The diagonal of the Green's function is needed to obtain the density matrix for the equilibrium part, but to obtain the non-equilibrium integrals we need the spectral function, which is calculated as A = G \Gamma G^{\dagger}, where G is the Green's function and \Gamma is the broadening matrix. Since \Gamma has the size of the electrodes, we only need the part of the Green's function corresponding to the electrodes' atomic indices and not the full matrix, that's true.
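A small sketch of that slicing, assuming elec holds the electrode orbital indices and G, Gamma are dense arrays at a single energy (names illustrative):

# Only the electrode columns of G enter A = G Gamma G^dagger,
# so the diagonal of A can be built from an (no, n_elec) slice of G.
G_elec = G[:, elec]                       # shape (no, n_elec)
A_diag = np.einsum('ik,kl,il->i', G_elec, Gamma, G_elec.conj()).real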

AleksBL commented 2 years ago

I think one of the np.dot's in the spectral-function calculation can be thrown away and replaced by an elementwise multiplication and a sum: A_ii = (G * (Gamma @ G.T).T).sum(axis=1)

I'm being a bit nitpicky about this, sorry ;) I don't know if it'll be significant in the end.

zerothi commented 2 years ago

Einsum would be a good candidate here. But benchmarking it would be nice ;)

AleksBL commented 2 years ago

import numpy as np

M = np.random.random((4000, 200))
gam = np.random.random((200, 200))

r1 = np.diag(np.dot(M, np.dot(gam, M.T)))
r2 = (M * np.dot(gam, M.T).T).sum(axis=1)
r3 = np.einsum('ik, kl, li -> i', M, gam, M.T, optimize='greedy')
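One way to compare the three expressions with the arrays above (only a sketch; the thread does not show the actual timing harness):

import timeit

for name, stmt in [
        ("r1", "np.diag(np.dot(M, np.dot(gam, M.T)))"),
        ("r2", "(M * np.dot(gam, M.T).T).sum(axis=1)"),
        ("r3", "np.einsum('ik, kl, li -> i', M, gam, M.T, optimize='greedy')")]:
    # average wall time per call, in seconds
    print(name, timeit.timeit(stmt, number=10, globals=globals()) / 10)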

sofiasanz commented 2 years ago

I think one of the np.dot's in the spectral-function calculation can be thrown away and replaced by an elementwise multiplication and a sum: A_ii = (G * (Gamma @ G.T).T).sum(axis=1)

I'm being a bit nitpicky about this, sorry ;) I don't know if it'll be significant in the end.

Yes, sorry! You're completely right. In the end we need only the diagonal of the spectral function and we could return these quantities instead. I've changed this in commit 5756adfc0dd611acd803c38f28cf23a5c41442f2.

import numpy as np

M = np.random.random((4000, 200))
gam = np.random.random((200, 200))

r1 = np.diag(np.dot(M, np.dot(gam, M.T)))
r2 = (M * np.dot(gam, M.T).T).sum(axis=1)
r3 = np.einsum('ik, kl, li -> i', M, gam, M.T, optimize='greedy')

Indeed I get these times for each operation from above:

r1 -> 0.06980013847351074
r2 -> 0.006043910980224609
r3 -> 0.004599571228027344

So it looks like the last one is the best one! Thank you for benchmarking it :)

AleksBL commented 2 years ago

To follow up on

I see only the diagonal of the spectral function is used... I think we can get away with only calculating the extremal columns and rows of the Green's function?

The diagonal of the Green's function is needed to obtain the density matrix for the equilibrium part, but to obtain the non-equilibrium integrals we need the spectral function, which is calculated as A = G \Gamma G^{\dagger}, where G is the Green's function and \Gamma is the broadening matrix. Since \Gamma has the size of the electrodes, we only need the part of the Green's function corresponding to the electrodes' atomic indices and not the full matrix, that's true.

I guess the electrode indices are always sorted into the first and last columns when tbtrans is run? Or are there some exceptions here @zerothi ?

zerothi commented 2 years ago

I guess the electrode indices are always sorted into the first and last columns when tbtrans is run? Or are there some exceptions here @zerothi ?

The electrodes may be located in any blocks (and quite often are). The only thing you'll know for sure is that they span at most 2 blocks.

zerothi commented 2 years ago

@AleksBL you can see in my toolbox/btd/_btd.py how the spectral function can be calculated in two different ways.

AleksBL commented 2 years ago

I will take a look :)