isayev / ASE_ANI

ANI-1 neural net potential with python interface (ASE)
MIT License

Clarification on training the model #17

Closed siddharthal closed 6 years ago

siddharthal commented 6 years ago

The paper describes an H2O example, where the total energy is the sum of individual contributions obtained from the two hydrogens and the oxygen. Could you clarify how the loss is propagated backwards when the same model is used for both hydrogens, given that only the total energy is available and the individual contributions are unknown during training?

Jussmith01 commented 6 years ago

Back propagation starts from the gradient of the loss function and proceeds as it does in any basic neural network model. Since E_t = sum(E_i), the chain rule gives dC/dw = sum_i (dC/dE_t)(dE_i/dw), which carries back propagation into the individual atomic networks. Gradients are summed (reduced) over atoms that share the same atomic symbol, since those atoms share one network. The individual contributions are a by-product of this process and are not fit to directly. They have no direct physical analogue; they simply represent the partitioning of the energy that the training process converged to, and they may or may not be consistent from one trained model to the next. Let me know if this needs a better explanation.
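To make this concrete, here is a minimal PyTorch sketch of the idea, not the actual ANI training code: the input dimension, layer sizes, random inputs, and reference energy are placeholder assumptions. It shows how gradients from two hydrogen atoms accumulate in one shared element network when only the total energy is fit.

```python
import torch
import torch.nn as nn

# One small network per element; atoms of the same element share weights.
# Layer sizes are placeholders, not the ANI architecture.
h_net = nn.Sequential(nn.Linear(8, 16), nn.CELU(), nn.Linear(16, 1))
o_net = nn.Sequential(nn.Linear(8, 16), nn.CELU(), nn.Linear(16, 1))

# Hypothetical per-atom environment vectors for H, H, O (random stand-ins
# for the real symmetry-function inputs).
aev_h1 = torch.randn(1, 8)
aev_h2 = torch.randn(1, 8)
aev_o = torch.randn(1, 8)

# Total energy is the sum of per-atom contributions: E_t = sum(E_i).
e_t = h_net(aev_h1) + h_net(aev_h2) + o_net(aev_o)

# Only the total energy is fit; this reference value is made up.
e_ref = torch.tensor([[-76.4]])
loss = (e_t - e_ref).pow(2).mean()

# Backprop: dC/dw = sum_i (dC/dE_t)(dE_i/dw), since dE_t/dE_i = 1.
# Gradients from both hydrogens accumulate in the shared h_net weights.
loss.backward()
print(h_net[0].weight.grad.norm())  # nonzero: H1 and H2 contributions summed
```

Because both hydrogens pass through the same `h_net`, autograd sums their gradient contributions into the shared weights automatically; nothing in the loss ever constrains the individual E_i values, which is why the learned energy partitioning is arbitrary.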

siddharthal commented 6 years ago

Thanks a lot. That clarifies it.