PattanaikL / GeoMol

MIT License

Add memoization of dihedral_pairs instead of computing them each iteration #1

Closed HannesStark closed 3 years ago

HannesStark commented 3 years ago

Add memoization of dihedral_pairs in the datasets so that they are computed only in the first epoch, then kept in memory and reused. This should speed up training, since computing the dihedral pairs previously took up 73% of the runtime in my experiments. That overhead now occurs only in the first epoch, and the additional memory usage is negligible.
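The caching idea can be sketched roughly as follows. This is a minimal, library-free illustration and not the code in this PR: the class name MemoizedDihedralDataset, the compute_pairs callback, and the toy edge lists are all hypothetical stand-ins for the real dataset and the real dihedral-pair computation.

```python
from typing import Callable, Dict, List, Tuple

Pairs = List[Tuple[int, int]]


class MemoizedDihedralDataset:
    """Toy dataset that computes dihedral pairs once per item and reuses them."""

    def __init__(self, edge_lists: List[Pairs], compute_pairs: Callable[[Pairs], Pairs]):
        self.edge_lists = edge_lists
        self.compute_pairs = compute_pairs  # hypothetical stand-in for the expensive step
        self._cache: Dict[int, Pairs] = {}  # item index -> cached dihedral pairs

    def __len__(self) -> int:
        return len(self.edge_lists)

    def __getitem__(self, idx: int) -> Pairs:
        # First access (first epoch): compute and store the result.
        if idx not in self._cache:
            self._cache[idx] = self.compute_pairs(self.edge_lists[idx])
        # Later accesses (later epochs): return the stored result.
        return self._cache[idx]


calls = 0


def count_and_compute(edges: Pairs) -> Pairs:
    """Placeholder computation that also counts how often it runs."""
    global calls
    calls += 1
    return [(u, v) for u, v in edges if u < v]


dataset = MemoizedDihedralDataset(
    [[(0, 1), (1, 0)], [(0, 1), (1, 2), (2, 1)]],
    count_and_compute,
)

# Three "epochs" over two items trigger only two computations.
for epoch in range(3):
    for i in range(len(dataset)):
        dataset[i]
```

One caveat with this pattern: a cache stored on the dataset object lives in whichever process calls `__getitem__`. With a PyTorch DataLoader that uses worker processes, each worker holds its own copy of the dataset, so the cache may not persist across epochs unless the workers themselves persist.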

The attribute on the PyTorch Geometric Data object is named edge_index_dihedral_pairs so that the dihedral pairs are treated as edge indices during batching. PyTorch Geometric then automatically offsets the indices of the dihedral_pairs according to the graph sizes when it creates a batch.
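To make the index offsetting concrete, here is a plain-Python sketch of what it amounts to. It does not use or reproduce PyTorch Geometric's code: batch_index_pairs is a hypothetical helper, and the two small graphs are made-up examples.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def batch_index_pairs(graphs: List[Tuple[int, List[Pair]]]) -> List[Pair]:
    """Concatenate per-graph index pairs, shifting each graph's indices by
    the total number of nodes in the graphs that precede it in the batch."""
    batched: List[Pair] = []
    offset = 0
    for num_nodes, pairs in graphs:
        batched.extend((a + offset, b + offset) for a, b in pairs)
        offset += num_nodes
    return batched


# Each graph is (number of nodes, index pairs local to that graph).
graphs = [
    (3, [(0, 1), (1, 2)]),
    (4, [(0, 3), (2, 1)]),
]
batched_pairs = batch_index_pairs(graphs)
print(batched_pairs)  # [(0, 1), (1, 2), (3, 6), (5, 4)]
```

The second graph's pairs are shifted by 3, the node count of the first graph, so they keep pointing at the right nodes once both graphs share a single node numbering.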

HannesStark commented 3 years ago

Training on Drugs seems to be around 4 times faster: the "fixed" version reaches epoch 100 in the time the previous version takes to reach epoch 27 🤗