Thank you, I have been looking for a solution for Markush enumerations! Does this new implementation allow for multiple attachment points? For example, when the R group is a linker?
@hollisullivan27, I don't understand your question; both examples in the blog post have a fragment with multiple attachment points. Can you be more specific about what you want to do, ideally including example molecules?
Hello,
I'm trying to replicate the steps in this tutorial with my own molecule and ran into an error. The molecule has two R groups attached to the same atom (Nitrogen). RGroupDecompose puts both into one 'R group' (R1), but molzip can't put them back together. I'd appreciate any insight on this. Thank you!
from rdkit.Chem import AllChem, RWMol, molzip, rdRGroupDecomposition as rgd
mol = AllChem.MolFromSmiles("CC1(C)CC(N(CCc2ccccc2)Cc2ccccc2)CCO1")
core = AllChem.MolFromSmiles("NCc1ccccc1")
rgs = rgd.RGroupDecompose([core], [mol])
core_fragment = rgs[0][0]['Core']
r1 = rgs[0][0]['R1']
product = RWMol(core_fragment)
product.InsertMol(r1)
molzip(product)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
testing.ipynb Cell 11' in <cell line: 10>()
      8 product = Chem.RWMol(core_fragment)
      9 product.InsertMol(r1)
---> 10 Chem.molzip(product)
RuntimeError: Invariant Violation
molzip: bond info already exists for end atom with label:1
Violation occurred on line 907 in file Code/GraphMol/ChemTransforms/MolFragmenter.cpp
Failed Expression: !bond.b
RDKIT: 2021.09.5
BOOST: 1_67
Hi @xescape. This is exactly why the blog post includes the instruction: "Remove any R groups which have more than one dummy atom. This happens if an R group is attached to the core at multiple points and it may mess up the rest of the analysis." At the moment molzip just doesn't support this.
The easiest solution is to add explicit dummy atoms to your core on atoms which can have more than one substituent:
core = AllChem.MolFromSmiles("[*:1]N([*:2])Cc1ccccc1")
rgs = rgd.RGroupDecompose([core], [mol])
core_fragment = rgs[0][0]['Core']
r1 = rgs[0][0]['R1']
r2 = rgs[0][0]['R2']
product = RWMol(core_fragment)
product.InsertMol(r1)
product.InsertMol(r2)
p = molzip(product)
print(AllChem.MolToSmiles(p))  # Chem itself is not imported in this snippet; AllChem re-exports MolToSmiles
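The check quoted from the blog post (drop R groups with more than one dummy atom) can be sketched with a small helper. This is only an illustration, not code from the blog post: `count_dummies` is a hypothetical helper name, and the two SMILES strings are hand-written stand-ins for what `RGroupDecompose` might return, not actual output.

```python
from rdkit import Chem


def count_dummies(mol):
    """Return the number of dummy atoms (atomic number 0) in a molecule."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)


# Illustrative R groups (assumed, not real RGroupDecompose output):
# one attachment point vs. two substituents sharing the same label.
single = Chem.MolFromSmiles("[*:1]CCc1ccccc1")
double = Chem.MolFromSmiles("[*:1]CCc1ccccc1.[*:1]C1CCOC(C)(C)C1")

for name, rgroup in (("single", single), ("double", double)):
    n = count_dummies(rgroup)
    status = "ok for molzip" if n == 1 else "skip, or relabel the core"
    print(name, n, status)
```

An R group that reports more than one dummy atom is the case that triggers the "bond info already exists" error above, so it either has to be filtered out or avoided by labelling the core explicitly.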
Ah, sorry I missed that. Thanks for your help!
From my experiments and poking about in the code, it appears that molzip uses the atom map numbers on the dummy atoms by default, not the isotope numbers as you say above. That's certainly consistent with the [*:1] notation in the SMILES you show; isotopes would show as [1*]. Could you amend the blog? It's the first hit when I searched for "rdkit molzip", and it sent me off on a bit of a wild goose chase. FragmentOnBonds uses isotopes to label the dummy atoms at the fragmentation points, which means the two don't play well together, but that's a separate issue.
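To make the difference between the two labelling schemes concrete, here is a minimal sketch. The helper `isotopes_to_atom_maps` is a hypothetical name introduced for illustration, and the fragments are made-up examples; it simply moves an isotope label on each dummy atom to its atom map number so that molzip's default matching can pair the fragments. Recent RDKit versions may also let you choose the label type through a parameters object passed to molzip, so it is worth checking the documentation for your version before relabelling by hand.

```python
from rdkit import Chem


def isotopes_to_atom_maps(mol):
    """Copy mol, moving each dummy atom's isotope label to its atom map number."""
    rw = Chem.RWMol(mol)
    for atom in rw.GetAtoms():
        if atom.GetAtomicNum() == 0 and atom.GetIsotope():
            atom.SetAtomMapNum(atom.GetIsotope())
            atom.SetIsotope(0)
    return rw.GetMol()


# Atom-map labelled fragments ([*:1]): molzip joins these by default.
mapped = Chem.molzip(Chem.MolFromSmiles("[*:1]c1ccccc1"),
                     Chem.MolFromSmiles("[*:1]CCO"))

# Isotope-labelled fragments ([1*]): relabel first, then zip.
iso_a = isotopes_to_atom_maps(Chem.MolFromSmiles("[1*]c1ccccc1"))
iso_b = isotopes_to_atom_maps(Chem.MolFromSmiles("[1*]CCO"))
relabelled = Chem.molzip(iso_a, iso_b)

print(Chem.MolToSmiles(mapped), Chem.MolToSmiles(relabelled))
```

Note that this only helps when both dummies created from a broken bond carry the same isotope value, which is an assumption of the sketch rather than something FragmentOnBonds guarantees with its default labels.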
Running this in PyCharm I get an error
conf = core.GetConformer()
ValueError: Bad Conformer Id
rdkit version = 2022.03.5 installed from conda
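One possible cause, offered as a guess since the full script is not shown: `GetConformer()` raises "Bad Conformer Id" when the molecule has no conformer at all, which is the case for a molecule freshly parsed from SMILES. The sketch below assumes that is what happened and generates 2D coordinates before asking for the conformer; the core SMILES is borrowed from earlier in this thread purely as an example.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor

# A molecule parsed from SMILES has no coordinates, hence no conformers.
core = Chem.MolFromSmiles("NCc1ccccc1")

# GetConformer() raises "ValueError: Bad Conformer Id" when no conformer exists,
# so generate 2D coordinates first if none are present.
if core.GetNumConformers() == 0:
    rdDepictor.Compute2DCoords(core)

conf = core.GetConformer()
print(conf.GetNumAtoms())
```

If the error persists after this, the core is probably being replaced by a coordinate-free copy somewhere between the depiction step and the `GetConformer()` call.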
R-group decomposition and molzip | RDKit blog
Generating molecules from all possible combinations of R groups
https://greglandrum.github.io/rdkit-blog/tutorial/rgd/2022/03/14/rgd-and-molzip.html