Refactor Interface - Githubissues

graeter-group / grappa

A Machine Learned Molecular Mechanics Force Field with integration into GROMACS and OpenMM

https://arxiv.org/abs/2404.00050

GNU General Public License v3.0

33 stars 0 forks source link

Refactor Interface #8

Closed ehhartmann closed 8 months ago

ehhartmann commented 1 year ago

Ideas:

Separate input generation from prediction
Make models available outside HITS

Implementation:

write an input dataclass
- lists of atoms, bonds, angle, proper, improper are attributes
- a method shall generate angle and proper from bonds (maybe as post init, if angle and proper are empty)
- additional_features are also an attribute and are a dict with name: array of shape n_atoms x feat_dim
- has a validate method to check the input
write a parameterization "grappa" class
- can be initialized with model
- has classmethod to initialize from config
- config contains architecture and location of weigths
  - weigths are stored as state dict and are accessed via torch.hub or local path
- has a "predict" method that takes an input dataclass instance and returns a output dataclass
- method has to check whether input and model are compatible because of the variable amount of additional features

ehhartmann commented 1 year ago

Do you even need the grappa class for training? To me, it would just be a wrapper of the model for prediction. Initializing the model for training could be completely separate. Maybe the grappa class would use the same function to init the model.

LeifSeute commented 1 year ago

Yep, the grappa class would not be needed for training or evaluation, only for inference. However, there will be a get_model(config) function in utils that returns a model with randomly initialized weights. This will also be used for training and the grappa class can import it for the constructor that accepts a config file/dict.

I would suggest that this utils function already uses the 'weights location' entry of the config and implements the torch hub download. This way we can easily fine-tune models that are initialized from a given state.

ehhartmann commented 1 year ago

Would be nice if you give me an overview of the tests you have and which ones should pass with the current code

ehhartmann commented 12 months ago

I forgot about the graph part of grappa....

Having a MolecularGraph class that is initialized with the input class (Molecule) and then parameterized using the model would be nice, I think. SysWriter looks like it does the job of handling the graph at the moment but it looks confusing to me. Do you think we can simplify it?

Also I think we need to store the mapping for our one-hot encodings for prediction. Do you already have a solution for that?

LeifSeute commented 12 months ago

This will be done completely new and independent of openmm. For representing the molecule, I would write a class that is essentially a typed dict + save and load methods

ehhartmann commented 12 months ago

Typed dict + save and load methods sounds like a dataclass.

My point is that there are two molecule representations. First, the arrays of atomnrs, elements, bonds, angles, impropers etc. and then the graph of it. Are the operations on the graph so complex that it needs to be an own class?

LeifSeute commented 12 months ago

In case you refer to the old PDBMolecule class: all of these old classes will be deleted, we don't really need them.

I think there is no difference between a bond list and a graph. So I completely agree. Our class for representing the idx lists should be as simple as possible. In the ML part of grappa, these lists will be used to create a dgl.Graph, but this is as simple as handing over the bond list to a dgl constructor.

(The only thing we have to do there is to map our atoms list, e.g. [2, 4, 7] to an enumeration [1,2,3] and then back after inference)

Btw. I think we should have a consistent naming convention for the idx lists. I suggest: atoms, bonds, angles, propers, impropers for the index tuple lists partial_charges, epsilons, sigmas, additional_features for the others