choderalab / openmoltools

An open set of tools for automating tasks relating to small molecules
MIT License
63 stars 30 forks source link

Update openeye.get_charges to allow retaining of provided conformer #146

Closed davidlmobley closed 9 years ago

davidlmobley commented 9 years ago

Currently, the openeye.get_charges function has an undocumented "feature" in that it generates entirely new conformations for the provided molecule and returns those along with the new charges (https://github.com/choderalab/openmoltools/blob/9cf5f615ce90e25eea75366e4f8ad9e8aa7cf9c2/openmoltools/openeye.py#L13).

This is often fine for solution-phase calculations where conformational interconversion times are short, but if we are looking at something like protein-ligand binding where we might want to start with the ligand already placed into the binding site, this is not correct.

The openeye.get_charges function needs to be updated so that: a) the default behavior is to calculate charges as it currently does, and copy these back onto the provided conformation b) optionally it can provide the calculated charges on entirely new conformations (as at present)

davidlmobley commented 9 years ago

@hanhnguyen14 - this is the issue you brought up.

kyleabeauchamp commented 9 years ago

So afaik the current default is based on best practices according to chris bayly On Jul 13, 2015 12:51 PM, "David L. Mobley" notifications@github.com wrote:

@hanhnguyen14 https://github.com/hanhnguyen14 - this is the issue you brought up.

— Reply to this email directly or view it on GitHub https://github.com/choderalab/openmoltools/issues/146#issuecomment-121036448 .

davidlmobley commented 9 years ago

@kyleabeauchamp - I think you misunderstand the issue. The current default indeed implements Bayly's best practices for charge calculation, and I don't propose changing how the charges are calculated. What I propose changing is what conformer is returned. That is, I ought to be able to get back the charges applied to my existing conformer if I want.

Imagine, for example, that I pull a PDB ligand conformation, calculate charges on the ligand, and then use it to run a simulation of the ligand on the binding site - expecting that I'll be simulating the ligand starting with the current bound conformation. Currently, this in fact would not be what I get because the charge calculation procedure actually changes the conformation.

The expected behavior would naively be that the charge calculation procedure just calculates charges (via Bayly's best practices) and then applies those charges to whatever conformation I've provided (i.e. my ligand conformation from the PDB).

So the issue is just which conformation is returned, NOT how the charges are obtained.

jchodera commented 9 years ago

@davidlmobley: This is the official, textbook, 100% Christopher Bayly recommended method for "Generating Canonical AM1-BCC Charges" scheme direct from his OpenEye Toolkit Cookbook article: https://docs.eyesopen.com/toolkits/cookbook/python/modeling/am1-bcc.html

We have been assured that the conformation generation scheme is integral to producing the correct charges in this scheme. The scheme you describe, while seemingly sensible, is not the "Canonical AM1-BCC Charge" scheme we should be recommending to users.

davidlmobley commented 9 years ago

@jchodera , @kyleabeauchamp : You guys are STILL missing my point. I am NOT talking about changing the charge calculation recipe. Bayly's recipe refers to how to calculate charges, and I 100% agree with it. HOWEVER, the current function not only calculates charges via Bayly's recipe, it ALSO modifies the conformation of the molecule which was provided, which is an entirely separate issue which is not part of his recipe.

Let me summarize my point by comparing the two implementations in question:

Current implementation: A) Take input conformation from user B) Generate lots of conformations C) Apply Bayly recipe to get charges D) Discard the user's input conformation and return Bayly's charges applied to the generated conformation(s)

Proposed implementation (with an option to still use the above implementation): A) Take input conformation from user B) Generate lots of conformations C) Apply Bayly recipe to get charges D) Return Bayly's charges applied to the supplied conformation

Again, I am not proposing changing the conformation generation scheme, NOR the recipe for calculating charges. I am only proposing changing the aspect wherein the user-supplied conformation is discarded whether they want it or not. It absolutely should not be used in the charge calculation, but the user might want to continue subsequent work from this conformation!

The main reason this is an issue is, as I said, that I may want to begin a simulation from a specific conformation of a molecule (as Chris Bayly himself does frequently). I should be able to calculate charges via Bayly's recipe and APPLY them to my desired conformation.

To put it one final way - a charge calculation function ought to calculate charges and assign them to my molecule. It ought not to modify the conformation of my molecule, unless I ask it to. ABSOLUTELY it needs to follow Bayly's recipe for GENERATING the charges (and this includes conformer generation). But the user doesn't need his INPUT conformation modified.

jchodera commented 9 years ago

HOWEVER, the current function not only calculates charges via Bayly's recipe, it ALSO modifies the conformation of the molecule which was provided, which is an entirely separate issue which is not part of his recipe.

D'oh. I somehow missed that the "unintended side effect" was "modifying the input molecule". This is an easy fix. Hold on for a PR.

davidlmobley commented 9 years ago

Yeah, I was going to do it myself shortly (already have the code, just dealing with some travel issues).

jchodera commented 9 years ago

OK, let me know if you want me to do it.

The current argument keep_confs says:

    keep_confs : int, optional, default=None                                                                                                                                                                                                                 
        If not None, only keep this many conformations in the final                                                                                                                                                                                          
        charged OEMol.  Multiple conformations are still used to *determine*                                                                                                                                                                                 
        the charges.  For example, keep_confs=1 will return an OEMol with                                                                                                                                                                                    
        just a single conformation, while keep_confs=None returns all the                                                                                                                                                                                    
        conformations generated by Omega.    

I think we want to

  1. Have this default to not overwriting the original conformation if None is specified
  2. Somehow prevent this from breaking any code that relied on the old keep_confs=None that would return all 800 conformations.
jchodera commented 9 years ago

I'm just not sure if changing the default behavior will break any of @kyleabeauchamp's code.

davidlmobley commented 9 years ago

My initial plan was to just make it do the first of those, and break the old code (since @kyleabeauchamp has mostly seemed OK with that for these types of issues) if it came to that.

I guess if we desperately want to do it without breaking the old code, we can return 801 conformations if keep_confs = None, with the user-provided one being the first. But that doesn't seem very appealing, partly because it will result in cumbersome workflow problems down the line (i.e. now we'll have to throw away 800 conformations).

I suppose if we want to do it without breaking old code, we probably ought to add a third option, keep_confs = True which will return only the original conformation(s) with the charges applied to those. But I'd rather break the old code as the current approach is not intuitive, in my view.

jchodera commented 9 years ago

I vote for breaking backward-compatibility here unless @kyleabeauchamp objects.

kyleabeauchamp commented 9 years ago

No objections

On Mon, Jul 13, 2015 at 5:14 PM, John Chodera notifications@github.com wrote:

I vote for breaking backward-compatibility here unless @kyleabeauchamp https://github.com/kyleabeauchamp objects.

— Reply to this email directly or view it on GitHub https://github.com/choderalab/openmoltools/issues/146#issuecomment-121096539 .