openforcefield / smarty

Chemical perception tree automated exploration tool.
http://openforcefield.org
MIT License

Molecular fragmentation scheme for charging large molecules #109

Open jchodera opened 7 years ago

jchodera commented 7 years ago

We will need a scheme for fragmenting large molecules into smaller electronically decoupled components to allow us to charge large molecules (such as biomolecules).

davidlmobley commented 7 years ago

FTR, I believe when we discussed this with Christopher, he indicated that:

jchodera commented 7 years ago

I'm not sure if "no solution" is better than "a crude solution" at this point. At least a crude solution is evaluable, and more sophisticated solutions can be benchmarked against it.

For example, if we fragment molecules according to Bemis-Murcko scaffolds, charge the fragments, glue them together, and scale the charges to equal the net formal charge, at least this is an evaluable scheme that can give us a working method for biopolymers, large molecules, and covalent adducts.
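To make the last step concrete, here is a minimal sketch of only the final renormalization, assuming the fragments have already been charged and glued back together. The function name `renormalize_charges` and the input layout (one list of partial charges per fragment, with no shared or capping atoms) are hypothetical. It also swaps in a uniform additive correction rather than a literal multiplicative scaling, because a multiplicative factor would zero out every charge on a neutral molecule; that choice is an assumption, not something settled in this thread.

```python
def renormalize_charges(fragment_charges, net_formal_charge):
    """Concatenate per-fragment partial charges and shift them uniformly
    so that their sum equals the molecule's net formal charge.

    fragment_charges: list of lists of floats, one inner list per fragment
                      (hypothetical layout; assumes no atom is shared).
    net_formal_charge: target total charge of the whole molecule.
    """
    # "Glue" step: flatten the fragments into one per-atom charge list.
    charges = [q for fragment in fragment_charges for q in fragment]
    if not charges:
        raise ValueError("no atoms to charge")

    # Spread the excess charge evenly over all atoms (additive correction).
    excess = sum(charges) - net_formal_charge
    shift = excess / len(charges)
    return [q - shift for q in charges]


# Two made-up fragments whose charges sum to +0.1 for a neutral molecule.
corrected = renormalize_charges([[0.10, -0.30], [0.25, 0.05]], 0)
print(corrected, sum(corrected))
```

Even something this crude gives a baseline total-charge-conserving scheme that more careful approaches (for example, weighting the correction by atom type) can be compared against.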

davidlmobley commented 7 years ago

I thought our goal at this point was "working small molecule forcefield"? I don't understand why we need to be thinking about larger molecules yet. Can you explain?

jchodera commented 7 years ago

There are some larger small molecules that present a challenge for the typical charging strategies. Even supramolecular hosts like CB7 are challenging.

In addition, covalent inhibitors present a huge challenge to current parameterization schemes.

Perhaps more importantly, though, without a strategy, it's unclear how our method can scale beyond small molecular liquids. It would be useful to at least have a vision for this now, even if we revise the precise implementation later.

davidlmobley commented 7 years ago

I agree that we have to deal with this, and no one's suggesting that we not do anything about it. The question is, "when?" My worry is just that if we invest time in it now it will impair our ability to get this effort "off the ground" in terms of finishing enough on small molecules to be able to motivate people to fund the effort and ensure its viability and continuation.

In other words, realistically we have to choose between maximizing the amount of time we spend getting a "working small molecule forcefield" now, versus figuring out how other pieces of the effort will work IF we get to do them. I don't want to end up in a situation where we have great plans for how we would deal with everything (such as extending it to large molecules, etc.) but we never get the opportunity to implement because we spent too many resources on long term plans and not enough on proof-of-principle for small molecules.

If there's a design issue we need to deal with now to avoid having to backtrack later, that's another issue, of course. I just don't see what it is yet... Do you?

jchodera commented 7 years ago

The major immediate issue is that we need to agree on a scheme for charging small molecules in the ThermoML dataset.

The major intermediate issue is the potential need to extend smarty to BCC sampling as part of our proof of concept.

The major long-term issue is a demonstration that we can scale our approach to handle biomolecular systems of pharmaceutical interest. We just need a guiding principle that can be refined, and if an implementation is quick, a proof of concept. Retreating to "we're waiting for someone else to solve this problem" isn't an acceptable position. We don't need to have the final solution, just the first steps of a clear path.

We definitely want to avoid distraction, but that is why it is important to get out ahead of this issue now and talk through a plan. We can decide on the optimal time to implement the first steps, but we need a plan.