I can say something here. Standardize was only ever used for (r)MD17 (and it didn't even make much sense there). Since MD17 consists of training on just one system, people were at some point using the standard deviation and the mean of that system's energies to make learning easier. Of course this does not apply when building general NNPs, and one can argue that it does not even make sense in the single-system case. To the best of my knowledge, it has never been used apart from (r)MD17.

The way to go with NNPs, imo, is atomic reference energies (even if they are learnable), because these scale to arbitrary systems. Since you are also currently having the discussion about Atomref, my point would be that having Atomref-like behavior inside the full model makes infinitely more sense than having standardize.

Also note that if no Atomref is provided by the dataset, one can build one by taking, for example, the mean energy per atom (or even per element, as some people do) over the whole dataset, with learnable rescaling and shifting parameters per element (see the sketch below). In that case, even though the reference is defined in terms of the dataset, the predictions can still be scaled to arbitrary systems.
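To make that last paragraph concrete, here is a minimal sketch of the idea, assuming a plain PyTorch setting; `fit_atomrefs` and `LearnableAtomref` are hypothetical names, not torchmd-net's actual Atomref machinery. The per-element references are obtained by a least-squares fit of total energies against element counts, then kept as trainable parameters so the model can refine them.

```python
# Hypothetical sketch: derive per-element reference energies from a dataset
# and make them learnable. Not torchmd-net's actual Atomref implementation.
import torch
import torch.nn as nn


def fit_atomrefs(zs_list, energies, max_z=100):
    """Least-squares fit of total energies against per-element atom counts.

    zs_list:  list of 1D LongTensors of atomic numbers, one per molecule
              (assumes all atomic numbers are < max_z)
    energies: 1D tensor of total energies, shape (n_molecules,)
    """
    counts = torch.zeros(len(zs_list), max_z)
    for i, zs in enumerate(zs_list):
        counts[i] = torch.bincount(zs, minlength=max_z).float()
    # Solve counts @ atomref ≈ energies in the least-squares sense
    atomref = torch.linalg.lstsq(counts, energies.unsqueeze(-1)).solution
    return atomref.squeeze(-1)  # shape (max_z,)


class LearnableAtomref(nn.Module):
    """Adds learnable per-element reference energies to per-atom predictions."""

    def __init__(self, initial_atomref):
        super().__init__()
        # Initialize from the dataset-derived values, but keep them trainable
        self.atomref = nn.Embedding.from_pretrained(
            initial_atomref.unsqueeze(-1), freeze=False
        )

    def forward(self, per_atom_energy, z):
        # per_atom_energy: (n_atoms, 1), z: (n_atoms,) atomic numbers
        return per_atom_energy + self.atomref(z)
```

Because the references enter as a sum over atoms, the shift is extensive: it stays meaningful for systems of any size or composition, which is exactly the property a dataset-wide mean/std lacks.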
I agree that atom references are a lot more powerful than simply removing the mean of the total energy, but I would argue that scaling to unit standard deviation can still be useful. Especially when not really thinking about the energy units (about which torchmd-net is agnostic), there should be a way to ensure proper scaling of the target values. Bad value ranges can really mess with training efficiency.
I cannot come up with any use of standardize apart from rMD17, beyond the abstract case of scaling the energies to unit standard deviation. In reality you will never have a std to refer to when doing inference; it is an artifact of the training set. Allegro's paper has a nice explanation of target normalizations.
Thanks for the wonderful insights, guys.
It is becoming clear to me that we should:
For me it is the most elegant and general way to proceed.
The documentation also needs a new section like "Dataset standardisation". I will add it in the PR for this; I will definitely need you to pour your knowledge into it.
I will be happy to help
When enabled, the standardize functionality processes the whole Dataset and stores the mean and std of the energies (well, the "y" field) in it:
https://github.com/torchmd/torchmd-net/blob/6d8e3159cfb8bb971ecf7a2abd589735d79a7e53/torchmdnet/data.py#L172-L202
These are then used during prediction:
https://github.com/torchmd/torchmd-net/blob/6d8e3159cfb8bb971ecf7a2abd589735d79a7e53/torchmdnet/models/model.py#L358-L375
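For context, this is roughly what the two snippets linked above amount to (a simplified sketch with hypothetical helper names, not the actual implementation):

```python
# Rough sketch of the standardize mechanism described above; simplified,
# hypothetical helpers, not the code behind the permalinks.
import torch


def compute_standardization(dataset):
    # Accumulate all "y" values (e.g. energies) across the whole dataset
    ys = torch.cat([data.y.flatten() for data in dataset])
    return ys.mean(), ys.std()


def apply_standardization(y_net, mean, std):
    # At prediction time the stored statistics rescale and shift the
    # network output back to the original energy scale
    return y_net * std + mean
```

Both numbers are pure training-set statistics, which is why, as discussed above, they become awkward the moment the model is used on systems outside that dataset.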
There are some issues we should consider with its current state:
I am opening this issue to start a discussion on what to do about these.