grimme-lab / dxtb

Efficient And Fully Differentiable Extended Tight-Binding
https://dxtb.readthedocs.io
Apache License 2.0

Performance Improvements #110

Closed marvinfriede closed 1 year ago

marvinfriede commented 1 year ago

Some problems and code pieces with potential for improvement have already been identified:

marvinfriede commented 1 year ago

I tried to use torch.nn.functional.pad but did not find a way to circumvent the costly loop, because the padding function only accepts a plain tuple per call and therefore cannot be vectorized either.

Code

Current implementation (this is the slow loop):

```python
for r, pair in enumerate(upairs):
    ovlp[
        pair[0] : pair[0] + norbi,
        pair[1] : pair[1] + norbj,
    ] = stmp[r]
return ovlp
```

Version with padding instead of indexing, which is much slower as it does not remove the loop:

```python
l = ovlp.shape[0]
padlist = (
    upairs[:, 1],
    l - upairs[:, 1] - norbj,
    upairs[:, 0],
    l - upairs[:, 0] - norbi,
)
for r in range(upairs.shape[0]):
    pad = (
        padlist[0][r],
        padlist[1][r],
        padlist[2][r],
        padlist[3][r],
    )
    ovlp += torch.nn.functional.pad(
        stmp[r],
        pad,
        "constant",
        0.0,
    )
return ovlp
```
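A possible way to drop the Python loop entirely is advanced indexing: build the row/column index grids for all blocks at once and scatter `stmp` into `ovlp` in a single assignment. This is only a minimal sketch under the assumptions that all blocks share the same shape `(norbi, norbj)` and do not overlap; `scatter_blocks` is a hypothetical helper, not part of dxtb.

```python
import torch


def scatter_blocks(
    ovlp: torch.Tensor,
    stmp: torch.Tensor,    # shape (npairs, norbi, norbj)
    upairs: torch.Tensor,  # shape (npairs, 2), top-left offset of each block
    norbi: int,
    norbj: int,
) -> torch.Tensor:
    """Write all blocks into `ovlp` with one advanced-indexing assignment."""
    # row indices of every block element: (npairs, norbi, 1)
    rows = upairs[:, 0].view(-1, 1, 1) + torch.arange(
        norbi, device=upairs.device
    ).view(1, -1, 1)
    # column indices of every block element: (npairs, 1, norbj)
    cols = upairs[:, 1].view(-1, 1, 1) + torch.arange(
        norbj, device=upairs.device
    ).view(1, 1, -1)
    # broadcasting yields (npairs, norbi, norbj) index grids matching stmp
    ovlp[rows, cols] = stmp
    return ovlp
```

If `ovlp` itself requires gradients, the out-of-place `ovlp.index_put((rows, cols), stmp)` avoids the in-place write; whether either variant actually beats the loop would still need benchmarking.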
hoelzerC commented 1 year ago

- loop over primitives in overlap gradient (https://github.com/grimme-lab/xtbML/pull/108#discussion_r1126274610)
- loop over position vector in overlap gradient (https://github.com/grimme-lab/xtbML/pull/108#discussion_r1126281350)
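For illustration only, the usual pattern for removing such primitive loops is to broadcast over an extra tensor dimension so that all primitive pairs are handled at once. The names below (`ai`, `aj` for exponents, `ci`, `cj` for contraction coefficients, `r2` for the squared center distance) are assumptions for this sketch, not the actual dxtb/xtbML variables:

```python
import torch


def primitive_pair_prefactors(
    ai: torch.Tensor,  # exponents of shell i, shape (nprim_i,)
    ci: torch.Tensor,  # contraction coefficients of shell i, shape (nprim_i,)
    aj: torch.Tensor,  # exponents of shell j, shape (nprim_j,)
    cj: torch.Tensor,  # contraction coefficients of shell j, shape (nprim_j,)
    r2: torch.Tensor,  # squared distance between the shell centers (scalar tensor)
) -> torch.Tensor:
    """Gaussian product prefactors for all primitive pairs without a Python loop."""
    # pairwise reduced exponents mu = ai*aj / (ai+aj), shape (nprim_i, nprim_j)
    mu = ai.unsqueeze(-1) * aj.unsqueeze(-2) / (ai.unsqueeze(-1) + aj.unsqueeze(-2))
    # weight each pair by its coefficients and the Gaussian prefactor exp(-mu * r2)
    return ci.unsqueeze(-1) * cj.unsqueeze(-2) * torch.exp(-mu * r2)
```

The same broadcasting idea applies to the loop over the position vector: keeping the three Cartesian components as a leading dimension lets the kernel evaluate all of them in one call.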

Done [1].