lanl / hippynn

python library for atomistic machine learning
https://lanl.github.io/hippynn/
Other
59 stars 22 forks source link

Nodes, docs, and an example for excited states #42

Closed tautomer closed 7 months ago

tautomer commented 9 months ago

One thing I changed that you may not like is

            # charges contain multiple targets, so set up broadcasting
            charges = charges.unsqueeze(2)
            positions = positions.unsqueeze(1)

Line 46 and 47 in layers/physics.py.

I did this so that dipoles and NACR are both (n_molecules, n_states, n_dims). Before, dipole was (n_molecules,n_dims, n_states) Let me what do you think.

lubbersnick commented 9 months ago

Thank you!

I'm not sure if we had an example that trained to multiple types of dipoles before, so maybe it's not a worry to change how that works - so long as it is documented?. A good thing for us to do to catch on this kind of thing is see if we can modify the ANI1x training to do dipole-based charge assignment training (aka ACA method).

In terms of the locations of the objects I wonder if it makes sense to make an "excited_states" node/layer files because of how your various pieces are related to each other. What do you think about that possibility? (it's a question for discussion, not sure what is best).

Thank you for including documentation!

There are some conflicts appearing here (such as CombineEnergies not appearing on your PR), this will need to be fixed before we could merge.

One aspect here I am not totally set on is the name "NACR" - if this is extremely standard we could use it, but it is not very verbose, so anyone unfamiliar will not have any idea what these objects are or how they work.

Can you edit the changelog to summarize your updates?

tautomer commented 9 months ago

I'm not sure if we had an example that trained to multiple types of dipoles before, so maybe it's not a worry to change how that works

Single-target will be the same, and all multi-target will be (n_molecule, n_targets, 3). Only existing multi-target dipole training will be affected, as the corresponding dataset would be (n_molecule, 3, n_targets). Do we even have this case before?

it's not a worry to change how that works - so long as it is documented?

We probably don't have this piece of information in the docs at all. Maybe I should add it. I have mentioned this in the example.

A good thing for us to do to catch on this kind of thing is see if we can modify the ANI1x training to do dipole-based charge assignment training (aka ACA method).

I'm not familiar with this part. Maybe I can ask Sakib.

In terms of the locations of the objects I wonder if it makes sense to make an "excited_states" node/layer files because of how your various pieces are related to each other. What do you think about that possibility? (it's a question for discussion, not sure what is best).

You mentioned this before. I actually had all functions/classes in one file as well. Either way has its own advantage. As of now, the locations follow the logic of the whole package, loss related in loss, layers in layers, etc. Throwing them in one file is easier to manage. Also, these functions hardly talk to anything else, and if there is any change, it will likely within the file. The only thing I'm unsure of is that where to keep this file, as the objects don't belong to any single top-level folder.

Thank you for including documentation!

Just some basic doc strings and inline comments, but the shape and type of the important variables should be clear to other users.

There are some conflicts appearing here (such as CombineEnergies not appearing on your PR), this will need to be fixed before we could merge.

I will take care of them tomorrow. Just some changes from Sakib are missing from my fork and some strings formatted differently.

One aspect here I am not totally set on is the name "NACR" - if this is extremely standard we could use it, but it is not very verbose, so anyone unfamiliar will not have any idea what these objects are or how they work.

Thanks for pointing this out. NACR (Non-Adiabatic Coupling vectoR) is always used within Sergei's academic family tree. Many others directly use NAC to refer to NACR and use something else for the scalar (Non-Adiabatic Coupling Term, NACT for us). Another common one is derivate coupling (DC) for NACR. It can indeed be a problem for other people. Is NonAdiabaticCouplingVector way to verbose? I can talk to Yu as well and see if he has any suggestions.

Can you edit the changelog to summarize your updates?

I totally forgot the changelog part. I will do it tomorrow.

lubbersnick commented 9 months ago

It can indeed be a problem for other people. Is NonAdiabaticCouplingVector way to verbose? I can talk to Yu as well and see if he has any suggestions.

I think either NonAdiabaticCoupling or NonAdiabaticCouplingVector would both be fine for class names.

tautomer commented 9 months ago

It can indeed be a problem for other people. Is NonAdiabaticCouplingVector way to verbose? I can talk to Yu as well and see if he has any suggestions.

I think either NonAdiabaticCoupling or NonAdiabaticCouplingVector would both be fine for class names.

Then NonAdiabaticCoupling? A little shorter and we can clearly state it is for the vector in the doc string. In fact, you cannot directly predict the scalar (it is basically NACR dot velocities), so there should not be confustions.

We will have

hippynn.layers.physics.NonAdiabaticCouplingMultiState hippynn.layers.physics.NonAdiabaticCoupling hippynn.graph.nodes.NonAdiabaticCouplingMultiStateNode hippynn.graph.nodes.NonAdiabaticCouplingNode

Should we keep all of them in the same file?

lubbersnick commented 8 months ago

@tautomer I moved the new functions around and grouped them with our other excited state type node. I also updated the documentation. Are there any outstanding issues?

tautomer commented 8 months ago

@lubbersnick Hi, Nick. Sorry for the delay. I'm totally fine with your edits.