David-Araripe closed this issue 3 months ago
You're taking more or less the most direct API path to the charges[^1], but you're probably running into the fundamental issue of assigning charges with AM1-BCC, especially if your ligand(s) are large[^2]. In all likelihood, the time it takes to assign vdW and all valence parameters is less than the time it takes to assign partial charges. So it's not you, it's AM1!
If you're using only free and open-source tools, AmberTools' sqm is assigning the partial charges. OpenEye's commercial tools can do the same thing[^3] but faster, and they offer free licenses for academics. (If the OpenEye Toolkits are installed and properly licensed, ELF10 will be used in these sorts of code paths.)
In the future, directly running AM1-BCC may be replaced by a lightning-fast graph neural net from NAGL, but that's not released yet.
Until then, you could just use the Molecule API to assign partial charges directly, and save those out to a file:

```python
from openff.toolkit import Molecule

molecule = Molecule.from_file("ligand.sdf")  # or however you load your ligand
molecule.assign_partial_charges(partial_charge_method="am1bcc")  # or "am1bccelf10" if you have OpenEye licensed
molecule.to_file("ligand_with_charges.sdf", file_format="sdf")
```
Even though it's hidden away, this is basically what Interchange ultimately does when using high-level calls like ForceField.create_openmm_system.
[^1]: There is also ForceField.get_partial_charges, which parameterizes the entire force field and just returns the partial charges in one go. Looking at your code makes me realize it could skip the other handlers and just go straight for the electrostatics ...
[^3]: Okay, it's actually a slightly different variant of AM1-BCC, but it's similar and people have argued it's better. You can read more here: https://openforcefield.github.io/standards/standards/smirnoff/#toolkitam1bcc-temporary-support-for-toolkit-based-am1-bcc-partial-charges
Thank you very much for the very detailed answer! I tried the approach from your first footnote:

> There is also ForceField.get_partial_charges, which parameterizes the entire force field and just returns the partial charges in one go.

and I noticed a significant speed-up. That is what I was looking for! Thanks! I'm closing this issue since your answer already solved it.
Happy to hear this is working, though I admit I'm a little surprised. There shouldn't be a functional difference in timings; if anything, ForceField.get_partial_charges should be (edit) slower!
Are you able to share examples of ligand(s) you're working with, or maybe some general features (size, chemistry, whether it's a bunch of rings or not, etc.)? It's fine if you can't, I can run comparisons on some public datasets.
Hi @mattwthompson, I will make a more robust test here by timing how long each of the approaches takes to get the ligand's charges. I will come back to you, hopefully by the end of the day (CET)!
Is there any caching being done in the background that I should be aware of while testing this?
There's a little bit of caching. Interchange caches charges based on the parameters of this function: https://github.com/openforcefield/openff-interchange/blob/v0.3.25/openff/interchange/smirnoff/_nonbonded.py#L464-L476
Which is to say that if you present the same Molecule object with the same mapped SMILES and the same partial charge method in the same Python process, the result will be looked up from a cache instead of re-computed. Since ForceField.get_partial_charges calls Interchange, that may explain the speedup you saw if you were calling it from the same interpreter. (Note I had a typo above; I'd expect this to be slower than your original approach on the first call, though I haven't tried it yet.)

There's no caching if using (only) the Molecule API.
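For intuition, the cache behaves like memoization keyed on the molecule's mapped SMILES plus the charge method. A minimal pure-Python sketch of that idea (not Interchange's actual implementation) using functools.lru_cache:

```python
from functools import lru_cache

calls = []  # track how often the "expensive" step actually runs


@lru_cache(maxsize=None)
def compute_charges(mapped_smiles: str, method: str) -> tuple:
    """Stand-in for an expensive AM1-BCC calculation, cached on its key."""
    calls.append((mapped_smiles, method))
    return (0.1, -0.1)  # dummy charges

# Same key twice: the second call is a cache hit, not a re-computation.
compute_charges("[H:2][O:1][H:3]", "am1bcc")
compute_charges("[H:2][O:1][H:3]", "am1bcc")

# Different method: cache miss, computed again.
compute_charges("[H:2][O:1][H:3]", "am1bccelf10")

print(len(calls))  # → 2
```

A fresh Python process starts with an empty cache, which is why timing in separate interpreters avoids this effect.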
Hi @mattwthompson, thanks for your patience and support. I ran a comparison, this time in two different Python processes to make sure I don't have caching. Here are the results; so you were right about assign_partial_charges being a slower approach than calling _electrostatics directly.
Please let me know if you're also interested in which molecular structures I used for this. I ran both of them locally (one after the other) while I was using the machine, so it might not be a 100% accurate experiment, but I think it gives a good idea.
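For anyone reproducing this kind of comparison, one way to rule out the in-process cache is to run each timing in a fresh interpreter. A rough sketch (the timed body here is a placeholder; swap in the charge-assignment call you want to measure):

```python
import subprocess
import sys

# Placeholder workload; replace with e.g. Molecule.assign_partial_charges(...)
snippet = """
import time
start = time.perf_counter()
sum(i * i for i in range(100_000))  # stand-in for the expensive call
print(time.perf_counter() - start)
"""


def time_in_fresh_process(code: str) -> float:
    """Run `code` in a new Python interpreter so no cache carries over."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return float(result.stdout.strip())


timings = [time_in_fresh_process(snippet) for _ in range(3)]
print(min(timings))  # best-of-3 wall time, in seconds
```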
This looks really cool! Are the blue points on the right from Molecule.assign_partial_charges or ForceField.get_partial_charges?
Oh, I didn't realize they would be different on the back-end, otherwise I would've included that in the test. But those points were from ForceField.get_partial_charges.
Okay, great. Molecule.assign_partial_charges should be the quickest of them all (if only marginally), so I wanted to be sure.
I suppose we can attribute the earlier result to something like run-to-run variance or (waves hands) magic happening under the hood. There's some magic that happens with sqm that causes slight differences at runtime (#1842 being a crazy and recent example) which are likely minor but a hassle to understand deeply.
Thanks for satisfying my curiosity on this - I have nothing more to ask of you, but please reach out if you run into any future questions about using OpenFF with your work!
I'm using openff 2.1.0 to parametrize some ligands for molecular dynamics simulations. For this, I also need the Electrostatics of the Interchange object. However, I noticed that the parametrization takes quite a long time to run. Is there a more straightforward (and faster) way to get these values? I'm not going to use the Interchange object later on, so I would just like to extract the Electrostatics values that are returned. I also tried doing this with multiple threads, but there was an error accessing other tools in the toolkit required for calculating the electrostatics. Here's how I'm currently doing it:
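A minimal sketch of the pattern being described (reconstructed here under assumptions, with a hypothetical ligand.sdf; not the exact original code):

```python
from openff.toolkit import ForceField, Molecule

molecule = Molecule.from_file("ligand.sdf")  # hypothetical input file
force_field = ForceField("openff-2.1.0.offxml")

# Parameterizing builds the full Interchange, including the slow charge step,
# even though only the electrostatics are needed afterwards.
interchange = force_field.create_interchange(molecule.to_topology())
charges = interchange["Electrostatics"].charges
```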
Do you have any feedback on how could I do this more efficiently?