Open ntBre opened 1 year ago
It looks like the error stems from conformer generation, not charge assignment per se. I wonder if any of these settings could be fiddled with? The toolkit could do a better job reporting what went wrong (or maybe Omega doesn't provide detail)?
In [6]: mol = Molecule.from_mapped_smiles("[H:16][c:1]1[c:3]([c:7]([c:11]([c:8]([c:4]1[H:19])[H:23])[C:14]([H:29])([H:30])[N@:1
...: 5]([c:12]2[c:9]([c:5]([c:2]([c:6]([c:10]2[H:25])[H:21])[H:17])[H:20])[H:24])[C:13]([H:26])([H:27])[H:28])[H:22])[H:18]"
...: )
In [7]: from openff.toolkit.utils.toolkits import OpenEyeToolkitWrapper
...: oe = OpenEyeToolkitWrapper()
...: oe.assign_partial_charges(mol, partial_charge_method="am1bccelf10")
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
---------------------------------------------------------------------------
ConformerGenerationError Traceback (most recent call last)
Cell In[7], line 3
1 from openff.toolkit.utils.toolkits import OpenEyeToolkitWrapper
2 oe = OpenEyeToolkitWrapper()
----> 3 oe.assign_partial_charges(mol, partial_charge_method="am1bccelf10")
File ~/mambaforge/envs/openff-interchange-env/lib/python3.11/site-packages/openff/toolkit/utils/openeye_wrapper.py:2420, in OpenEyeToolkitWrapper.assign_partial_charges(self, molecule, partial_charge_method, use_conformers, strict_n_conformers, normalize_partial_charges, _cls)
2418 mol_copy._conformers = None
2419 else:
-> 2420 self.generate_conformers(
2421 mol_copy,
2422 n_conformers=charge_method["rec_confs"],
2423 rms_cutoff=0.25 * unit.angstrom,
2424 make_carboxylic_acids_cis=True,
2425 )
2426 # TODO: What's a "best practice" RMS cutoff to use here?
2427 else:
2428 mol_copy._conformers = None
File ~/mambaforge/envs/openff-interchange-env/lib/python3.11/site-packages/openff/toolkit/utils/openeye_wrapper.py:2182, in OpenEyeToolkitWrapper.generate_conformers(self, molecule, n_conformers, rms_cutoff, clear_existing, make_carboxylic_acids_cis)
2180 new_status = omega(oemol)
2181 if new_status is False:
-> 2182 raise ConformerGenerationError(
2183 "OpenEye Omega conformer generation failed"
2184 )
2186 molecule2 = self.from_openeye(
2187 oemol, allow_undefined_stereo=True, _cls=molecule.__class__
2188 )
2190 if clear_existing:
ConformerGenerationError: OpenEye Omega conformer generation failed
Ah yes, I should have mentioned that in the issue! The root problem is indeed the Omega conformer generation.
I also noticed that despite there being 52 records, there are only 13 unique molecules. So a smaller, complete test set is here:
[H:19][C:4]1([C:1]2=[C:2]([N:14]=[C:3]([N:15]2[C:11]([H:33])([H:34])[H:35])[N:17]3[C:8]([C:6]([C:5]([C:7]([C:9]3([H:29])[H:30])([H:25])[H:26])([H:21])[H:22])([H:23])[H:24])([H:27])[H:28])[N@:16]([C:10]([N@:18]1[C:13]([H:39])([H:40])[H:41])([H:31])[H:32])[C:12]([H:36])([H:37])[H:38])[H:20]
[H:19][C:4]1([C:1]2=[C:2]([N:13]=[C:3]([N:14]2[C:10]([H:31])([H:32])[H:33])[N:17]3[C:5]([C:7]([N:18]([C:8]([C:6]3([H:23])[H:24])([H:27])[H:28])[C:12]([H:37])([H:38])[H:39])([H:25])[H:26])([H:21])[H:22])[N@:16]([C:9]([N:15]1[H:40])([H:29])[H:30])[C:11]([H:34])([H:35])[H:36])[H:20]
[H:18][c:1]1[c:2]([n:14][c:4]([c:3]([n:13]1)[N@:15]2[C:5]([C@@:7]([C@:8]([C:6]2([H:22])[H:23])([C:10]([H:28])([H:29])[H:30])[O:17][H:37])([H:24])[C:9]([H:25])([H:26])[H:27])([H:20])[H:21])[N:16]([C:11]([H:31])([H:32])[H:33])[C:12]([H:34])([H:35])[H:36])[H:19]
[H:26][c:2]1[c:4]([c:3]([c:6]([n:22][c:5]1[C:19]([H:47])([H:48])[H:49])[N@@:23]2[C:15]([C@@:17]3([C:7](=[O:25])[N:24]([C:13]([C:8]([C:11]3([H:33])[H:34])([H:27])[H:28])([H:37])[H:38])[C:20]([H:50])([H:51])[C:16]4([C:9]([C:10]4([H:31])[H:32])([H:29])[H:30])[H:43])[C:12]([C:14]2([H:39])[H:40])([H:35])[H:36])([H:41])[H:42])[C:1]#[N:21])[C:18]([H:44])([H:45])[H:46]
[H:21][c:1]1[n:14][c:7]([c:2]2[c:8]([n:15]1)[O:20][N:16]=[C:4]2[C:9]([H:22])([H:23])[H:24])[N@:18]([C:12]([H:31])([H:32])[H:33])[C:13]([H:34])([H:35])[C:3]3=[C:6]([O:19][N:17]=[C:5]3[C:10]([H:25])([H:26])[H:27])[C:11]([H:28])([H:29])[H:30]
[H:23][c:1]1[c:2]([c:4]([n:16][c:5]([c:3]1[H:25])[N@:17]2[C:12]([C@:13]3([C:6](=[O:19])[N:18]([C:10]([C:7]([C:8]3([H:28])[H:29])([H:26])[H:27])([H:32])[H:33])[C:14]([H:38])([H:39])[H:40])[C:9]([C:11]2([H:34])[H:35])([H:30])[H:31])([H:36])[H:37])[C:15]([F:20])([F:21])[F:22])[H:24]
[H:26][c:2]1[c:3]([c:6]([n:22][c:5]([c:4]1[C:18]([H:44])([H:45])[H:46])[C:19]([H:47])([H:48])[H:49])[N@@:23]2[C:15]([C@@:17]3([C:7](=[O:25])[N:24]([C:13]([C:8]([C:11]3([H:33])[H:34])([H:27])[H:28])([H:37])[H:38])[C:20]([H:50])([H:51])[C:16]4([C:9]([C:10]4([H:31])[H:32])([H:29])[H:30])[H:43])[C:12]([C:14]2([H:39])[H:40])([H:35])[H:36])([H:41])[H:42])[C:1]#[N:21]
[H:25][c:1]1[c:2]([c:8]([c:12]([c:9]([c:3]1[H:27])[H:33])[N@@:22]([C:20]([H:44])([H:45])[H:46])[C:21]([H:47])([H:48])[c:11]2[c:6]([c:4]([c:10]([c:5]([c:7]2[H:31])[H:29])[C@:18]3([C@@:19]([C:17]([C:15]([C:14]([C:16]3([H:38])[H:39])([H:34])[H:35])([H:36])[H:37])([H:40])[H:41])([H:43])[C:13](=[O:23])[O:24][H:49])[H:42])[H:28])[H:30])[H:32])[H:26]
[H:25][c:1]1[c:2]([c:8]([c:12]([c:9]([c:3]1[H:27])[H:33])[N@@:22]([C:20]([H:44])([H:45])[H:46])[C:21]([H:47])([H:48])[c:11]2[c:6]([c:4]([c:10]([c:5]([c:7]2[H:31])[H:29])[C@:18]3([C@@:19]([C:17]([C:15]([C:14]([C:16]3([H:38])[H:39])([H:34])[H:35])([H:36])[H:37])([H:40])[H:41])([H:43])[C:13](=[O:24])[O-:23])[H:42])[H:28])[H:30])[H:32])[H:26]
[H:20][c:1]1[c:3]([n:16][c:7]([c:5]2[c:6]1[O:19][C:4](=[C:2]2[H:21])[H:23])[N@@:18]([C:12]([H:32])([H:33])[H:34])[C:15]([H:39])([H:40])[C:13]([H:35])([H:36])[C:14]([H:37])([H:38])[N:17]3[C:10]([C:8]([C:9]([C:11]3([H:30])[H:31])([H:26])[H:27])([H:24])[H:25])([H:28])[H:29])[H:22]
[H:20][c:1]1[c:3]([n:16][c:7]([c:5]2[c:6]1[O:19][C:4](=[C:2]2[H:21])[H:23])[N@@:17]([C:12]([H:32])([H:33])[H:34])[C:14]([H:37])([H:38])[C:13]([H:35])([H:36])[C:15]([H:39])([H:40])[N+:18]3([C:10]([C:8]([C:9]([C:11]3([H:30])[H:31])([H:26])[H:27])([H:24])[H:25])([H:28])[H:29])[H:41])[H:22]
[H:11][c:1]1[c:2]([c:4]([n:9][c:5]([c:3]1[H:13])[N@:10]([C:7]([H:18])([H:19])[H:20])[C:8]([H:21])([H:22])[C:6]([H:15])([H:16])[H:17])[H:14])[H:12]
[H:16][c:1]1[c:3]([c:7]([c:11]([c:8]([c:4]1[H:19])[H:23])[C:14]([H:29])([H:30])[N@:15]([c:12]2[c:9]([c:5]([c:2]([c:6]([c:10]2[H:25])[H:21])[H:17])[H:20])[H:24])[C:13]([H:26])([H:27])[H:28])[H:22])[H:18]
I tried adjusting some of these settings to no avail. I tried slapping some zeros on things (SetEnergyWindow(1500.0)
, SetRMSThreshold(100.0)
), setting both of these options to zero, and even commenting out all of the omega.*
calls hoping to get the default values. It appears Omega just returns a bool saying whether or not it succeeded, so there's not much more that can be done in the toolkit.
I don't actually know what the warning means, or how severe it is in OpenEye's jargon
Warning: : Failed to build structure from CT
except that CT is the connection table
~CT probably means "carbon, tetrahedral" or something like that. I see it a lot in atom-typing-land.~ (Whoops, I spoke too quickly. Disregard this part but the rest of the comment stands)
Glancing at this quickly I don't see anything hugely obvious. My remaining possibilities are roughly 50% this being a representation/molecule sanitization issue with us that OE has started policing (maybe we're doing something bad in to_openeye
), and 50% this being a regression in OE.
If we can make a reproducing case with an un-mapped version of one of these SMILES using pure OpenEye code, then it'll both be clear that it's a regression on their side, and we'll have a repro example we can immediately send to them. I don't have time to do this today but if either of you do, this is the route I'd recommend for debugging!
Oh, actually, I take back the CT thing. It probably does mean connection table in this context.
Here are the un-mapped SMILES. I'll try to reproduce with just OpenEye code.
[H]C1(C2=C(N=C(N2C([H])([H])[H])N3C(C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])([H])[H])[N@](C([N@]1C([H])([H])[H])([H])[H])C([H])([H])[H])[H]
[H]C1(C2=C(N=C(N2C([H])([H])[H])N3C(C(N(C(C3([H])[H])([H])[H])C([H])([H])[H])([H])[H])([H])[H])[N@](C(N1[H])([H])[H])C([H])([H])[H])[H]
[H]c1c(nc(c(n1)[N@]2C([C@@]([C@](C2([H])[H])(C([H])([H])[H])O[H])([H])C([H])([H])[H])([H])[H])N(C([H])([H])[H])C([H])([H])[H])[H]
[H]c1c(c(c(nc1C([H])([H])[H])[N@@]2C([C@@]3(C(=O)N(C(C(C3([H])[H])([H])[H])([H])[H])C([H])([H])C4(C(C4([H])[H])([H])[H])[H])C(C2([H])[H])([H])[H])([H])[H])C#N)C([H])([H])[H]
[H]c1nc(c2c(n1)ON=C2C([H])([H])[H])[N@](C([H])([H])[H])C([H])([H])C3=C(ON=C3C([H])([H])[H])C([H])([H])[H]
[H]c1c(c(nc(c1[H])[N@]2C([C@]3(C(=O)N(C(C(C3([H])[H])([H])[H])([H])[H])C([H])([H])[H])C(C2([H])[H])([H])[H])([H])[H])C(F)(F)F)[H]
[H]c1c(c(nc(c1C([H])([H])[H])C([H])([H])[H])[N@@]2C([C@@]3(C(=O)N(C(C(C3([H])[H])([H])[H])([H])[H])C([H])([H])C4(C(C4([H])[H])([H])[H])[H])C(C2([H])[H])([H])[H])([H])[H])C#N
[H]c1c(c(c(c(c1[H])[H])[N@@](C([H])([H])[H])C([H])([H])c2c(c(c(c(c2[H])[H])[C@]3([C@@](C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])([H])C(=O)O[H])[H])[H])[H])[H])[H]
[H]c1c(c(c(c(c1[H])[H])[N@@](C([H])([H])[H])C([H])([H])c2c(c(c(c(c2[H])[H])[C@]3([C@@](C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])([H])C(=O)[O-])[H])[H])[H])[H])[H]
[H]c1c(nc(c2c1OC(=C2[H])[H])[N@@](C([H])([H])[H])C([H])([H])C([H])([H])C([H])([H])N3C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])[H]
[H]c1c(nc(c2c1OC(=C2[H])[H])[N@@](C([H])([H])[H])C([H])([H])C([H])([H])C([H])([H])[N+]3(C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])[H])[H]
[H]c1c(c(nc(c1[H])[N@](C([H])([H])[H])C([H])([H])C([H])([H])[H])[H])[H]
[H]c1c(c(c(c(c1[H])[H])C([H])([H])[N@](c2c(c(c(c(c2[H])[H])[H])[H])[H])C([H])([H])[H])[H])[H]
This fails with 2023.1.0 and works with 2022.1.1:
from openeye import oechem, oeomega
smiles = "[H]C1(C2=C(N=C(N2C([H])([H])[H])N3C(C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])([H])[H])[N@](C([N@]1C([H])([H])[H])([H])[H])C([H])([H])[H])[H]"
# steps taken from
# https://docs.eyesopen.com/toolkits/python/oechemtk/molctordtor.html#construction-from-smiles
mol = oechem.OEMol()
assert oechem.OESmilesToMol(mol, smiles)
omega = oeomega.OEOmega()
assert omega(mol)
And checking all of them with the SMILES pasted into smiles.dat
:
from openeye import oechem, oeomega
win = 0
with open('smiles.dat') as inp:
for smiles in inp:
mol = oechem.OEMol()
assert oechem.OESmilesToMol(mol, smiles[:-1])
omega = oeomega.OEOmega()
win += omega(mol)
print(win)
prints 13 for 2022.1.1 and 0 for 2023.1.0.
It looks like each molecule contains a chiral nitrogen bound to an aromatic ring. Most of these are bound to the carbon between nitrogens in pyrimidine, but this one doesn't have that
C1(=C(C(=C(C(=C1[H])[H])N(C([H])([H])[H])C([H])([H])C2=C(C(=C(C(=C2[H])[H])[C@H]3[C@H](C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])C(=O)O[H])[H])[H])[H])[H])[H]
I'd expect much more than 13 structures (of a few thousand?) to have a nitrogen near a ring, at least with my experience in these datasets I get the impression that aromatic rings are ubiquitous and chiral nitrogens are common
I'm still running the benchmark to see if this is the root cause of the differences I've observed between the original Sage 2.1.0 fit and my attempt to reproduce it. I expected there to be more than 13 molecules too, but those are the only records I found in the opt-set-for-fitting-2.1.0.json in the Sage-2.1.0 repo. There are 5580 records therein, but only 1701 when converting the list of record.cmiles
entries to a set. So it's something like 13/1701 I guess.
This must not account for all of the difference I've seen because I have 23 fewer opt-geo batches than the Sage 2.1.0 repo, which should contain ~600 structures total. But this conformer generation error definitely prevented me from running ForceBalance on the Sage 2.1.0 inputs with my environment.
chiral nitrogens are common
I wonder if this is the issue. "Which nitrogens are chiral?" is an open question and different cheminformatics toolkits don't always agree. In particular, some recent omega releasenotes say:
Conformer generation with OEOmega.Build or OEMolBuilder.Build now fails correctly when specified stereo signatures on a molecule cannot be honored. The existing behavior, to carry on even if stereo signatures could not be honored, can be obtained by setting IgnoreStereo to true.
So maybe it's that Omega is getting strict here - for example N@
and N@@
must now be honored in the output, but Omega's internal FF/geometry generator for some reason always makes those planar, so it can never match either stereochemistry.
The releasenotes mention that we can set the (ominously-named) IgnoreStereo
to True to recover the old behavior. Could you give that a shot?
I'm not sure if I'm using the Builder
correctly, but this does fail to run on 2022.1.1 for AttributeError: 'OEMolBuilderOptions' object has no attribute 'SetIgnoreStereo'
, so that change must have occurred between the two.
from openeye import oechem, oeomega
win = 0
with open("smiles.dat") as inp:
for smiles in inp:
mol = oechem.OEMol()
assert oechem.OESmilesToMol(mol, smiles[:-1])
builder = oeomega.OEMolBuilder()
options = oeomega.OEMolBuilderOptions()
options.SetIgnoreStereo(True)
builder.SetOptions(options)
assert builder.Build(mol) == oeomega.OEOmegaReturnCode_Success
omega = oeomega.OEOmega()
assert omega.Build(mol) == oeomega.OEOmegaReturnCode_FailedCTBuild
win += omega(mol)
print(win)
Edit: This fails for every conformer in the 2023 version too. Sorry I forgot to mention that originally! I brought up the 2022 performance to show that I was using new code (SetIgnoreStereo
) but forgot to say that it doesn't change the 2023 result (still fails to generate conformers, as seen in the return code assertion).
I have also now tried the latest release
mamba create -n openeye-2023.1.1 -c openeye 'openeye-toolkits=2023.1.1' openff-qcsubmit
with the same output:
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
Warning: : Failed to build structure from CT
0
I wonder if this is an OpenEye support issue; if the OpenEye toolkit previously could generate conformers for these but now it cannot, is that on their end?
I updated my last comment with some additional details and just sent an email to OpenEye support!
Should I close this? Based on the response from OpenEye support, it sounds like a known/expected change from their perspective. And unless we want to propagate the OEMolBuilderOptions
or OEOmegaOptions
in their code examples, it's not really a problem with our toolkit.
I'm looking through some of my opened issues today and thought I would close some if they are no longer relevant. I'll leave it if it's useful to have around, though.
I think so; if we're not likely to do anything about it and if it's too esoteric to document here, there is no reason to keep this open
@amcisaac just asked whether we should be using OE 2022 for our work since it can charge more molecules. This is a complex topic, and while previous discussion talked about the CAUSE of the issue (and concluded that it's intentional from OE's end), it didn't resolve what we should DO about it (like, does this have a big enough impact on FF fitting/application that we should implement a workaround)
I think the discussion above this can be recapped as
SetIgnoreStereo(True)
) doesn't work (?)This may have been discussed elsewhere, but I wasn't able to easily find it, so I'd like to confirm on this issue:
SetIgnoreStereo(True)
really not work to recover the old behavior? (I think @ntBre's original comment indicates this doesn't work)Kinda a messy science/infra crossover question here. I'd be interested in thoughts from @lilyminium @amcisaac @ntBre on the science impact. Could be as simple as "doesn't really affect fitting, not a big deal", but it'd just be good to have that written down somewhere.
Thanks @j-wags for the summary. My current thoughts are below where I think I've talked myself into upgrading, but very keen to hear @amcisaac and @ntBre's takes on it too.
Does SetIgnoreStereo(True) really not work to recover the old behavior? (I think @ntBre's original comment indicates this doesn't work)
When OpenEye support got back to us they also suggested enumerating nitrogens -- does that recover the old behaviour in this case as well?
Does this affect FF fitting significantly? If so, are the changes to FF fitting clearly good/bad?
Should we have our infrastructure provide the old behavior?
I don't have very strong feelings about this, but I would lean toward just updating. OpenEye support confirmed that it was intended behavior, and we can't really pin to 2022 forever. This affects a small number of molecules (that sound pretty weird anyway), so I don't think the net effect on fitting is substantial if even noticeable. I think I opened this when I was still debugging why I couldn't reproduce the Sage 2.1 fit and thought this might be a contributing factor, but that turned out to be a different issue.
Could one way to do this be to fall back on RDKit conformer generation when OpenEye refuses?
I think this would be a tough solution since the current design philosophy is that the OpenFF Toolkit can run with RDKit OR OpenEye, so adding some dependence (even optional) between them will increase complexity and diminish reproducibility.
In my opinion I think we have to upgrade, as others have said we can't pin to a 2022 version forever, and I suspect people may not know to pin to it or may forget why it is pinned and try to unpin it to resolve other conda incompatibilities (like me 😅 which is why I re-opened it ).
I don't have a good sense about whether these molecules are unique or how it would affect training. If we are worried that the upgrade would be eliminating a large chunk of chemistry, I would be happy to assess how similar these molecules are to others in our training set. If it would be eliminating some unique molecules that we do want to keep, we could always generate new data for those functional groups that don't have the chiral N problem (unless the chiral N is inherent to those chemistries, I suppose...).
Some good news - I've tested @ntBre's 13 unique molecules in the latest version of OpenEye and they appear to work.
So I think we just shouldn't use OpenEye Toolkits 2023 for fitting work, and that this issue can be closed :-)
Great, totally happy to upgrade -- thanks @j-wags for looking into it!
I'm having (maybe) the same issue for the SMILES [H][C@]12[C@](C1([H])C(=O)N3C(C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])([H])[H])([C@]4(C([C@@]2(C(C4([H])[H])([H])[H])[H])([H])[H])[H])[H]
with openeye-toolkits 2024.1.1
that didn't have any errors with 2022.1
. I don't see a chiral nitrogen in this SMILES string, so it's possible it's a different conformer generation issue. This is from the industry benchmark dataset. Just giving a heads up that anyone doing benchmarking using the Sage 2.2.0 dataset may want to keep pinning to 2022 Open Eye. In the mean time, I'll do another filter and see how many other molecules are affected there (this is just the one that caused my run to crash).
ValueError: No registered toolkits can provide the capability "assign_partial_charges" for args "()" and kwargs "{'molecule': Molecule with name '' and SMILES '[H][C@]12[C@](C1([H])C(=O)N3C(C(C(C(C3([H])[H])([H])[H])([H])[H])([H])[H])([H])[H])([C@]4(C([C@@]2(C(C4([H])[H])([H])[H])[H])([H])[H])[H])[H]', 'partial_charge_method': 'am1bccelf10', 'use_conformers': None, 'strict_n_conformers': False, 'normalize_partial_charges': True, '_cls': <class 'openff.toolkit.topology.molecule.Molecule'>}"
Available toolkits are: [ToolkitWrapper around OpenEye Toolkit version 2024.1.1, ToolkitWrapper around The RDKit version 2024.03.5, ToolkitWrapper around AmberTools version 23.6, ToolkitWrapper around Built-in Toolkit version None]
ToolkitWrapper around OpenEye Toolkit version 2024.1.1 <class 'openff.toolkit.utils.exceptions.ConformerGenerationError'> : OpenEye Omega conformer generation failed
ToolkitWrapper around The RDKit version 2024.03.5 <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : partial_charge_method 'am1bccelf10' is not available from RDKitToolkitWrapper. Available charge methods are {'mmff94': {}, 'gasteiger': {}}
ToolkitWrapper around AmberTools version 23.6 <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : partial_charge_method 'am1bccelf10' is not available from AmberToolsToolkitWrapper. Available charge methods are {'am1bcc': {'antechamber_keyword': 'bcc', 'min_confs': 1, 'max_confs': 1, 'rec_confs': 1}, 'am1-mulliken': {'antechamber_keyword': 'mul', 'min_confs': 1, 'max_confs': 1, 'rec_confs': 1}, 'gasteiger': {'antechamber_keyword': 'gas', 'min_confs': 0, 'max_confs': 0, 'rec_confs': 0}}
ToolkitWrapper around Built-in Toolkit version None <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : Partial charge method "am1bccelf10"" is not supported by the Built-in toolkit. Available charge methods are {'zeros': {'rec_confs': 0, 'min_confs': 0, 'max_confs': 0}, 'formal_charge': {'rec_confs': 0, 'min_confs': 0, 'max_confs': 0}}
After re-filtering the benchmark dataset, it looks like just this one molecule was affected, which has 6 conformers. The mapped SMILES that is output by our charge filtering script is [C:1]1([H:17])([H:18])[C:2]([H:19])([H:20])[C:6]([H:27])([H:28])[N:15]([C:14]([C@:13]2([H:37])[C@@:11]3([H:35])[C@@:9]4([H:33])[C:4]([H:23])([H:24])[C:5]([H:25])([H:26])[C@:10]([H:34])([C:8]4([H:31])[H:32])[C@:12]32[H:36])=[O:16])[C:7]([H:29])([H:30])[C:3]1([H:21])[H:22]
which appears to generate an identical visualization as the above SMILES string. My intuition would be that it's not a big deal to upgrade based on this, as it's just one molecule out of ~10,000 and doesn't appear to have any particularly unique chemistry, but we do need to re-filter our benchmark dataset.
Describe the bug The newer version of
openeye-toolkits
fails to assign partial charges where the previous version did so successfully. The core of the attached Python script iswhere
mol
is anopenff.toolkit.Molecule
.To Reproduce Create two conda environments:
Run the included example.py script with both environments
Output The first one will print
52 succeeded, 0 failed
, while the second will print0 succeeded, 52 failed
.Additional context For an even smaller test set, the 2 SMILES below fail to charge in the new version but charge in the old version:
And these two charge successfully in both cases:
repro.zip