choderalab / modelforge

Infrastructure to implement and train NNPs
https://modelforge.readthedocs.io/en/latest/
MIT License

Further speed up to _subtract_self_energies #123

Closed chrisiacovella closed 4 months ago

chrisiacovella commented 4 months ago

Further speed up to _subtract_self_energies in DataModule by directly indexing into the atomic_numbers tensor to avoid additional overhead. Reduces cost by roughly 50%.

As further described in Issue #120, the overhead of __getitem__ appears to be an additional factor. We are only using the atomic numbers, so all the other data returned when indexing into the dataset is not needed and just slows things down. I directly indexed into the tensor to get the relevant atomic numbers, which gives a further substantial speedup (reducing cost by about 50%). Update: I overlooked that I was still using the getter to retrieve the energy; the speedup is even better (see timings below).
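The idea above can be sketched as follows. This is an illustrative, torch-free mock-up (the array names, toy self-energy values, and flat-storage layout are assumptions, not the actual modelforge attributes): instead of calling `__getitem__` per record, which materializes positions and other fields we never use, we index once into a flat atomic-numbers array and reduce per record.

```python
import numpy as np

# Hypothetical flat storage: atomic numbers for all records concatenated,
# plus per-record atom counts (illustrative names, not modelforge's).
atomic_numbers = np.array([1, 1, 8, 6, 1, 1, 1, 1])  # e.g. H2O then CH4
n_atoms = np.array([3, 5])
energies = np.array([-76.4, -40.5])

# Toy per-element self energies; real values come from the dataset/QM method.
self_energies = {1: -0.5, 6: -37.8, 8: -75.0}
lookup = np.zeros(120)
for z, e in self_energies.items():
    lookup[z] = e

# Direct indexing: one vectorized gather plus a segmented sum per record,
# with no per-record __getitem__ call.
offsets = np.concatenate(([0], np.cumsum(n_atoms)))
per_atom_self = lookup[atomic_numbers]
record_self = np.add.reduceat(per_atom_self, offsets[:-1])
corrected = energies - record_self
```

The same pattern carries over to torch tensors (`lookup[atomic_numbers]` followed by a segmented reduction), which is what avoids the dataset getter entirely.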

This will additionally address issue #124, swapping in np.cumsum for torch._utils._accumulate in the record splitting routine.
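For the record-splitting change, the swap amounts to replacing a pure-Python running sum (`torch._utils._accumulate`) with a vectorized `np.cumsum` to get the split boundaries. A minimal sketch, with illustrative split sizes:

```python
import numpy as np

# torch._utils._accumulate computes running sums of split lengths in pure
# Python; np.cumsum produces the same offsets vectorized.
lengths = [700, 200, 100]  # e.g. train/val/test record counts (illustrative)
offsets = np.cumsum(lengths)

# Slice the record indices into the three splits at those offsets.
indices = np.arange(offsets[-1])
splits = np.split(indices, offsets[:-1])
```

Each entry of `splits` then holds the contiguous record indices for one partition, matching what the accumulate-based loop produced.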


chrisiacovella commented 4 months ago

I realized I was still using the getter function to fetch the energy. Switching this to direct access further reduced the time: SPICE2 now takes 54 seconds, and ANI2X 4:36.

chrisiacovella commented 4 months ago

@wiederm Just noticed that you had started to make some of the same changes in #122.