graeter-group / grappa

A Machine Learned Molecular Mechanics Force Field with integration into GROMACS and OpenMM
https://arxiv.org/abs/2404.00050
GNU General Public License v3.0
33 stars 0 forks source link

Refactor Interface #8

Closed ehhartmann closed 8 months ago

ehhartmann commented 1 year ago

Ideas:

Implementation:

ehhartmann commented 1 year ago

Do you even need the grappa class for training? To me, it would just be a wrapper of the model for prediction. Initializing the model for training could be completely separate. Maybe the grappa class would use the same function to init the model.

LeifSeute commented 1 year ago

Yep, the grappa class would not be needed for training or evaluation, only for inference. However, there will be a get_model(config) function in utils that returns a model with randomly initialized weights. This will also be used for training and the grappa class can import it for the constructor that accepts a config file/dict.

I would suggest that this utils function already uses the 'weights location' entry of the config and implements the torch hub download. This way we can easily fine-tune models that are initialized from a given state.

ehhartmann commented 1 year ago

Would be nice if you give me an overview of the tests you have and which ones should pass with the current code

ehhartmann commented 12 months ago

I forgot about the graph part of grappa....

Having a MolecularGraph class that is initialized with the input class (Molecule) and then parameterized using the model would be nice, I think. SysWriter looks like it does the job of handling the graph at the moment but it looks confusing to me. Do you think we can simplify it?

Also I think we need to store the mapping for our one-hot encodings for prediction. Do you already have a solution for that?

LeifSeute commented 12 months ago

This will be done completely new and independent of openmm. For representing the molecule, I would write a class that is essentially a typed dict + save and load methods

ehhartmann commented 12 months ago

Typed dict + save and load methods sounds like a dataclass.

My point is that there are two molecule representations. First, the arrays of atomnrs, elements, bonds, angles, impropers etc. and then the graph of it. Are the operations on the graph so complex that it needs to be an own class?

LeifSeute commented 12 months ago

In case you refer to the old PDBMolecule class: all of these old classes will be deleted, we don't really need them.

I think there is no difference between a bond list and a graph. So I completely agree. Our class for representing the idx lists should be as simple as possible. In the ML part of grappa, these lists will be used to create a dgl.Graph, but this is as simple as handing over the bond list to a dgl constructor.

(The only thing we have to do there is to map our atoms list, e.g. [2, 4, 7] to an enumeration [1,2,3] and then back after inference)

Btw. I think we should have a consistent naming convention for the idx lists. I suggest: atoms, bonds, angles, propers, impropers for the index tuple lists partial_charges, epsilons, sigmas, additional_features for the others