aiqm / torchani

Accurate Neural Network Potential on PyTorch
https://aiqm.github.io/torchani/
MIT License

Fix handwritten cosine similarity and add test for NaNs in AEVComputer #561

Closed IgnacioJPickering closed 3 years ago

IgnacioJPickering commented 3 years ago

The handwritten cosine similarity outputs NaN for superimposed coordinates because of the singularity in (v1 · v2) / (||v1|| * ||v2||) when one of the vectors has zero norm.

Occasionally, during an MD simulation with a very large number of atoms, two atoms happen to be pushed to the same coordinates. When this happens, torchani outputs NaN as the energy and the simulation breaks.
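For illustration, here is a minimal standalone sketch of the failure and of a clamp-style fix. It is not the torchani code itself: the function names `cosine_naive` and `cosine_clamped` and the `eps` value are assumptions chosen for this example.

```python
import torch


def cosine_naive(v1, v2):
    # (v1 . v2) / (||v1|| * ||v2||): evaluates to 0/0 when either vector has zero norm
    return (v1 * v2).sum(-1) / (v1.norm(dim=-1) * v2.norm(dim=-1))


def cosine_clamped(v1, v2, eps=1e-10):
    # Hypothetical fix: keep the denominator away from zero (eps is an assumed value)
    denom = torch.clamp(v1.norm(dim=-1) * v2.norm(dim=-1), min=eps)
    return (v1 * v2).sum(-1) / denom


# The second row of v1 is a zero vector, as when two atoms share the same coordinates
v1 = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
v2 = torch.tensor([[0.0, 1.0, 0.0], [1.0, 2.0, 3.0]])

naive = cosine_naive(v1, v2)      # second entry is NaN
clamped = cosine_clamped(v1, v2)  # all entries are finite
print(naive, clamped)
```

With the clamp, the degenerate pair contributes a cosine of 0 instead of NaN, and well-separated pairs are unaffected as long as `eps` is far below any realistic product of distances.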

With clamp:

   torchani.aev.cutoff_cosine - 0.2s
   torchani.aev.radial_terms - 0.3s
   torchani.aev.angular_terms - 0.8s
   torchani.aev.compute_shifts - 0.0s
   torchani.aev.neighbor_pairs - 0.0s
   torchani.aev.neighbor_pairs_nopbc - 3.7s
   torchani.aev.triu_index - 0.0s
   torchani.aev.cumsum_from_zero - 0.1s
   torchani.aev.triple_by_molecule - 3.2s
   torchani.aev.compute_aev - 9.9s
Total AEV - 10.0s
Forward - 18.1s
Backward - 4.0s
Optimizer - 4.3s
Others - 1.7s
Epoch time - 38.1s

Without clamp:

   torchani.aev.cutoff_cosine - 0.2s
   torchani.aev.radial_terms - 0.3s
   torchani.aev.angular_terms - 0.7s
   torchani.aev.compute_shifts - 0.0s
   torchani.aev.neighbor_pairs - 0.0s
   torchani.aev.neighbor_pairs_nopbc - 3.6s
   torchani.aev.triu_index - 0.0s
   torchani.aev.cumsum_from_zero - 0.1s
   torchani.aev.triple_by_molecule - 3.2s
   torchani.aev.compute_aev - 9.8s
Total AEV - 9.8s
Forward - 18.0s
Backward - 4.0s
Optimizer - 4.3s
Others - 1.7s
Epoch time - 37.9s

The clamp should only affect the angular terms, so there is essentially no performance dip (roughly 0.1 s in angular_terms in the timings above).

IgnacioJPickering commented 3 years ago

@yueyericardo How does the cuda test work? It is randomly failing for me

yueyericardo commented 3 years ago

> @yueyericardo How does the cuda test work? It is randomly failing for me

Not sure, but it looks like a rerun passed.