dipc-cc / hubbard

Python tools for mean-field Hubbard models
https://dipc-cc.github.io/hubbard/
GNU Lesser General Public License v3.0

BTD Inversion #101

Open AleksBL opened 3 years ago

AleksBL commented 3 years ago

Hi, I have worked a bit on the BTD inversion. One problem right now is the for-loops over energy in the NEGF code, which make the code I have a bit hard to integrate with yours. I worked on a workaround, but it hasn't worked out yet.

The attached file contains a really barebones implementation of a BTD inversion (iG_dropin.py, whose use is demonstrated at the bottom of the Test_Ribbon.py script)... for the example with a 1440-orbital Hamiltonian it is twice as fast as the scipy implementation. There is a lot of room to make it faster, and the second-to-last block in the Test_Ribbon.py script hopefully demonstrates this, but we need to do the inversions in one go instead, as this would remove a lot of overhead converting between sparse, block-matrix and numpy formats... :) The code can approach (but is still a bit slower than) tbtrans speeds if larger batches of energy points are used.
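
For context, the diagonal blocks of a block-tridiagonal inverse are usually obtained with two recursive sweeps rather than a full dense inversion. A minimal sketch of that standard recursion (the function name, block layout and the dense toy setup are my own, not taken from the attached code):

```python
import numpy as np

def btd_diagonal_blocks(D, U, L):
    """Diagonal blocks of G = A^{-1} for a block-tridiagonal A with
    diagonal blocks D[i], upper blocks U[i] = A[i, i+1] and lower
    blocks L[i] = A[i+1, i]. Standard recursive Green's function sweep."""
    n = len(D)
    # Left-connected sweep: gL[i] accounts for blocks 0..i
    gL = [None] * n
    gL[0] = np.linalg.inv(D[0])
    for i in range(1, n):
        gL[i] = np.linalg.inv(D[i] - L[i-1] @ gL[i-1] @ U[i-1])
    # Right-connected sweep: gR[i] accounts for blocks i..n-1
    gR = [None] * n
    gR[-1] = np.linalg.inv(D[-1])
    for i in range(n - 2, -1, -1):
        gR[i] = np.linalg.inv(D[i] - U[i] @ gR[i+1] @ L[i])
    # Combine both sweeps into the diagonal blocks of the full inverse
    G = []
    for i in range(n):
        Sigma = np.zeros_like(D[i])
        if i > 0:
            Sigma = Sigma + L[i-1] @ gL[i-1] @ U[i-1]
        if i < n - 1:
            Sigma = Sigma + U[i] @ gR[i+1] @ L[i]
        G.append(np.linalg.inv(D[i] - Sigma))
    return G
```

The cost scales with the number of blocks times the cost of inverting a single block, which is why narrow, long structures benefit most.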

The Test_Ribbon script demonstrates the usage, and has a commented-out part where the DFT calculation is made. Send.tar.gz

BR. Aleksander

zerothi commented 3 years ago

FYI did you see the btd in sisl/toolbox/btd? If your code enhances that one it could be of general use? ;-)

sofiasanz commented 3 years ago

Hi Aleksander, thank you for the work! :)

I see. I will take a look at the code. It's probably not so hard to avoid the energy loops by inverting the multidimensional array at once; I think numpy.linalg.inv can invert this type of stacked matrix.
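
A minimal sketch of what inverting all energy points at once could look like (the toy Hamiltonian, energy grid and broadening eta are assumed values, not taken from the hubbard code):

```python
import numpy as np

# Batched Green's function: one (n, n) matrix per energy point, stacked
# into an (nE, n, n) array.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4))
H = (H + H.T) / 2                        # Hermitian toy Hamiltonian
E = np.linspace(-1.0, 1.0, 8)            # 8 energy points
eta = 1e-2                               # small positive broadening

# (E + i*eta) * I - H, broadcast over the leading energy axis
A = (E[:, None, None] + 1j * eta) * np.eye(4) - H

# numpy.linalg.inv inverts along the last two axes, so the explicit
# Python loop over energies disappears:
G = np.linalg.inv(A)                     # shape (8, 4, 4)

# Same result as the per-energy loop, up to floating point:
G_loop = np.array([np.linalg.inv(A[e]) for e in range(len(E))])
assert np.allclose(G, G_loop)
```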

BTW I have seen that you import some Python modules, such as seekpath, which obtains band paths in the Brillouin zone of crystal structures. Maybe this could be done with sisl? Just to avoid having to import too many modules.

AleksBL commented 3 years ago

Yes, the imports can be cut down to numpy, scipy, sisl and optionally numba; a lot of the other imported stuff is superfluous here. :)

sofiasanz commented 2 years ago

Hi @AleksBL! I made a few changes in #105 following your comment (https://github.com/dipc-cc/hubbard/issues/101#issue-947058764). There I eliminated the for-loops over energy used to obtain the Green's function per energy point; instead, the Green's function is now obtained in one go for all energy points. Is something like this what you expected, in order to implement your BTD code?

AleksBL commented 2 years ago

Yes, this is more compatible! I have a working _G right now, but I also want it to use the fact that the Green's function is a symmetric matrix (and only calculate half the matrix elements), so I'll send you the code soon'ish... :)
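
A small sketch of the half-matrix idea, assuming G really is symmetric (which holds, e.g., for a Gamma-point Green's function of a symmetric Hamiltonian); the matrix values are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
G_full = A + A.T                      # stand-in for a symmetric Green's function

# Pretend only the lower triangle (including the diagonal) was computed:
G_half = np.tril(G_full)

# Mirror the strict lower triangle to recover the full matrix,
# so only about half the elements ever need to be calculated:
G_rec = G_half + np.tril(G_half, -1).T
assert np.allclose(G_rec, G_full)
```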

AleksBL commented 2 years ago

Take a look at "Calculation.py" in this one... I cut out a lot of crap from the previous block-matrix code, so hopefully the "block_linalg" code is better. :) There might be some functions that aren't used for this and don't work. Send2.zip

edit to elaborate: for the code to work together with yours, it only really needs to get the pivot indices from somewhere (e.g. a tbtrans calculation, together with the electrode indices and electrode self-energies...). There are some modes it can run in: Full and DOS. The DOS mode only calculates the diagonal of the Green's function and is about twice as fast as Full in this example.
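
A toy illustration of why DOS mode only needs the diagonal: the density of states follows from -Im Tr G(E) / pi. The 3-site diagonal Hamiltonian here is an assumed example (with a dense inverse this saves nothing, but in a BTD scheme only the diagonal blocks have to be formed):

```python
import numpy as np

# Assumed 3-site toy Hamiltonian with on-site energies 0, 1, -1:
H = np.diag([0.0, 1.0, -1.0])
E, eta = 0.0, 1e-2                    # energy and small broadening

# Retarded Green's function at this energy:
G = np.linalg.inv((E + 1j * eta) * np.eye(3) - H)

# Site-resolved (local) DOS from the diagonal only:
ldos = -np.imag(np.diag(G)) / np.pi
dos = ldos.sum()                      # total DOS at energy E
```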

AleksBL commented 2 years ago

@sofiasanz by the way, are the elec_idx arrays supposed to have the (1, n) shape, or is it me who has gotten the wrong impression at some point? :)

zerothi commented 2 years ago

@sofiasanz by the way, are the elec_idx arrays supposed to have the (1, n) shape, or is it me who has gotten the wrong impression at some point? :)

Indices with that dimensionality can remove redundant indexing steps; in your code you have lots of:

pvt = pvt.ravel() # just to make it clear it is 1D
G = G[pvt, :][:, pvt]

however, this can be done in a single step with:

pvt = pvt.reshape(-1, 1)
G = G[pvt, pvt.T]

since it gets broadcast to the correct indices.
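
A small check of the broadcast-indexing equivalence (the toy matrix and pivot values are assumed):

```python
import numpy as np

# Toy Green's function and pivot array:
G = np.arange(25.0).reshape(5, 5)
pvt = np.array([3, 0, 4])

# Two-step fancy indexing (rows, then columns):
G1 = G[pvt, :][:, pvt]

# One step: an (n, 1) column of row indices broadcast against a (1, n)
# row of column indices selects the same submatrix.
p = pvt.reshape(-1, 1)
G2 = G[p, p.T]

assert np.allclose(G1, G2)
```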

AleksBL commented 2 years ago

@zerothi I just think it introduces a hidden transposition of the self-energies if it is indeed (1, n)? If it were (n, 1) I think it would be fine :) It's very inconsequential though, because the self-energies are symmetric...

zerothi commented 2 years ago

@zerothi I just think it introduces a hidden transposition of the self-energies if it is indeed (1, n)? If it were (n, 1) I think it would be fine :) It's very inconsequential though, because the self-energies are symmetric...

True, but generally it should be (n, 1); I don't know about the (1, n) shape. But self-energies are not symmetric at finite k.

AleksBL commented 2 years ago

I think it's me who made this mistake in elec_idx to start with... :)

AleksBL commented 2 years ago

Hi, just to follow up on the BTD code: what's the biggest system that has been treated using the method your code implements? And should we see if we can beat it? I also think there's a fairly easy fix for the pivoting problem we encountered: we can just make an additional piece of code that expands the blocks where the tbtrans self-energies live to also include the electrode indices...

/Aleksander

sofiasanz commented 2 years ago

Hi @AleksBL, sorry for the delay, I had been a little busy :)

I would say that the largest system we have treated is the device formed of crossed GNRs (around 900 atoms), and I can run it locally on my computer with the brute-force inversion methodology.

For the electrode indices, nice! Do you need anything from my side? We could also use the pivoting indices from TBT and fold them into the device as TBT does, but only in the case of BTD inversion.

Sofia

AleksBL commented 2 years ago

This seems to work:

def get_btd_partition_of_matrix(_csr, start):
    import numpy as np
    from scipy.sparse.csgraph import reverse_cuthill_mckee as rcm
    from scipy import sparse as sp
    assert _csr.shape[0] == _csr.shape[1]
    # bandwidth-reducing permutation, then find the block sizes greedily
    p = rcm(_csr)
    no = _csr.shape[0]
    csr = _csr[p, :][:, p]
    Part = [0, start]
    x0 = Part[-1]
    while x0 < no:
        sl = slice(Part[-2], Part[-1])
        # coupling of the current block to everything after it
        coup = csr[sl, sl.stop:]
        i, j, v = sp.find(coup)
        if len(i) == 0:
            break
        # the next block must span all columns the current block couples to
        x0 = np.max(j) + 1
        Part += [Part[-1] + x0]
    btd = [Part[i+1] - Part[i] for i in range(len(Part)-1)]
    return p, btd, Part

Test:

import sisl
#from siesta_python.funcs import get_btd_partition_of_matrix
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg
import matplotlib.pyplot as plt
#from Block_matrices.Block_matrices import test_partition_2d_sparse_matrix as testP

g = sisl.geom.graphene(orthogonal = True)
H = sisl.Hamiltonian(g)
H.construct([[0.1, 1.5], [0.0, -2.7]])
H = H.tile(20, 1).tile(50, 0)
H.set_nsc([1, 3, 1])
# remove some random sites to make the structure less regular
i = np.arange(H.no)
np.random.shuffle(i)
i = i[0:H.no - 10]
H = H.sub(i)
h = H.Hk()
# sparsity pattern of the Hamiltonian
mat = sp.csr_matrix((H.no, H.no))
mat[(abs(h) > 1e-5)] = 1.0

piv, btd, part = get_btd_partition_of_matrix(mat, 20)

# the permuted Hamiltonian must have the same spectrum
newh = h[piv, :][:, piv]
#f, S = testP(newh, part)
e1, v1 = sp.linalg.eigsh(h)
e2, v2 = sp.linalg.eigsh(newh)
assert np.allclose(e1, e2)

@zerothi

AleksBL commented 1 year ago

Hi again, scipy turned out to have this function called "reverse_cuthill_mckee", which gives a pivoting scheme for the device without throwing away the electrode indices. Just to show some of the results from the Test_BTD3.py script (images attached: the geometry, and results with and without BTD).

It seems to give almost the same as the regular way, but there is a small discrepancy in the decimals. I don't know if it's a float32 vs float64 artifact or if they actually calculate something different.

Larger geometry, 4500 sites, very narrow and long, which is the kind of structure the BTD will perform well on. RAM usage is ~5 GB max and it takes ~20 min. (images attached)

For a ~1100-orbital system similar to the two above, the BTD code is about 20 times faster: each step with the regular np.linalg.inv takes as long as the whole calculation does with the BTD algorithm.