Open bondrewd opened 12 months ago
I am leaving here OpenMM's implementation for the MARTINI 3 parser as a reference:
https://github.com/maccallumlab/martini_openmm/blob/master/martini_openmm/martini.py
I think MARTINI would be a great addition to Molly. It would stretch our API too - it seems they use things like virtual sites and CMAP torsions? - so would be a useful opportunity to change the API to support these systems. Can MARTINI be implemented with pairwise and specific interactions for instance, or is a general interaction required?
There is an experimental Gromacs file parser already in src/setup.jl but it needs a bit of work. You might be able to make that more robust and allow it to parse the MARTINI force field?
PME is on the todo list, I've been avoiding it as it is a significant block of work and I haven't needed it myself yet, but that would be a great contribution too. Along with constraints and GPU performance, both of which are being worked on, that is the last thing standing in the way of proper biomolecular simulation.
I can add an example based on my results after the publication.
Sounds good.
Great! Then I will start working on the MARTINI 3 implementation for Molly. Here is a list of the steps I will follow:
.top parser
.itp
files.top parser
.gro parser
.gro parser
virtual site interaction
virtual site interaction
MARTINI 3
constructor for the System
typeMARTINI 3
MARTINI 3
implementationConsidering my current schedule, I might be able to finish it probably by the end of the year (~3 to ~4 months).
Regarding cmap
, we need it when using the CHARMM force field. Unless someone else is interested in developing it, I can work on the CHARMM (and AMBER) force fields, one at a time, after finishing the MARTINI 3 implementation.
Looks like a good set of steps. As you have been doing so far, submit smaller self-contained PRs (e.g. one for .top
, one for .gro
) so we can discuss any design decisions as they arrive. One would be whether to use a similar two-step procedure as for OpenMM setup, i.e. construct a MolecularForceField
with the .top
file and then construct a System
with the coordinate file and the force field. If MolecularForceField
can be modified to take all the required data then this would be my preferred approach, since it is a consistent API and opens the door for things like converting force field files later on.
You may also get some use out of Chemfiles.jl, which we already use for the OpenMM setup and which supports .gro
files, but don't feel you have to if it is missing required features. I think it doesn't read in the topology, for instance.
Yes CMAP needs doing at some point so feel free. The Amber force fields without CMAP should work currently, minus things like PME and virtual sites.
I'm also in for the MolecularForceField
interface. As you mentioned, having a structure (MolecularForceField
) as an entry point for the force fields facilitates the interconversion between different models and the combination of different interactions (for example, multi-scale simulation using a CG and a AA region).
This approach can be used even with the custom interactions to stabilize the API of the System
structure. For example, instead of feeding custom interactions directly into the System
API (a) or a MolecularForceField
when using an existing force field (b), we could feed everything into a MolecularForceField
instead and then pass it to the System
constructor (c).
@jgreener64 What do you think about this idea? In the meantime, I will start updating the MolecularForceField
structure. I already have a lexer for the Gromacs .top
and .itp
files. My next step is implementing a parser for building a MolecularForceField
structure before submitting the first PR.
Regarding using Chemfiles.jl
, it seems it is best suited for working with systems at the trajectory level. The nice thing is that apart from .gro
files, we also get access for free to all the other supported formats. Additionally, using a library that covers all the details of each format frees us from a painful load of work (try implementing a .dcd
reader/writer that works across different software...). The only downside (?) is the introduction of a zero-indexing dependency. Of course, this is not a limitation, but having to work with one-indexing and zero-indexing in the same codebase increases the complexity and makes it easier to introduce a bug accidentally. Ultimately, it is a matter of weighing the pros and cons.
Related discussion in https://github.com/JuliaMolSim/Molly.jl/issues/85.
I prefer having a separation between the force field object and an object for the force field applied to a particular system. In this case specific interactions and constraints contain information on the topology, pairwise interactions generally don't, and general interactions may or may not. I would be in favour of putting those 4 things into a struct called something like Interactions
as proposed in the other issue, and passing that to System
(or pass a MolecularForceField
and coordinate file, like we do currently, and create an Interactions
during setup). It's not a burning issue though and would be a very breaking change, so would require careful thought.
Thanks for pointing out the relevant issue! Now that I think more about it and following what you mentioned, designing the MolecularForceField as a step before the Interactions
structure makes the code more flexible. Especially since it allows adding interactions on top of an existing force field (pulling simulations, external fields, umbrella sampling, etc.).
@jgreener64 Hi! I just wanted to mention that I've been working on this PR slowly these last two months because it was very hectic for me, but from the 15th of this month, I will be able to resume my previous working rhythm. Bests!
Nice one!
Hi @jgreener64! I'm currently working on improving the Gromacs topology processing pipeline. I've divided it into a lexer and a parser. While working on creating structures for the parser output, I noticed that Martini doesn't utilize all the available fields as listed in the Gromacs documentation. What are your thoughts on incorporating support only for the Gromacs directives and fields that Martini actually uses?
For instance, the Gromacs 'default' directive includes 6 fields, but Martini only makes use of 2 of them.
Given that our primary goal is to support the Martini force field and not the general Gromacs topology format, I believe that this initial implementation should suffice to achieve our objective. We can always consider adding support for additional directives if it's requested or required in the future.
Yes that sounds fine. We do want a path to supporting most things eventually, but it can be progressive. I think impementing what is required for Martini should also provide most/all of what is required for all-atom protein simulation?
I have in mind two possible target systems to use as an example for the membrane barostat:
Each of these examples requires further development:
The easiest one is implementing a MARTINI 3 example. I can work on the required parsers and send a PR. @jgreener64, what do you think?
On the other hand, I am currently using the barostat with my own coarse-grained model. As I mentioned to you before, I am trying to improve it by using Molly's differentiable simulation. I can add an example based on my results after the publication.