forlilab / Meeko

Interfacing RDKit and AutoDock
GNU Lesser General Public License v2.1
171 stars 41 forks source link

Question about ring breaking rules #72

Closed rwxayheee closed 9 months ago

rwxayheee commented 9 months ago

Hi Meeko dev team, I have a question about the ring breaking rules for ligands with flexible macrocycles. Is it true that, in the current version, the glue atoms will never be placed on aromatic carbons, but it is possible on other SP2 carbons?

Please consider the following example - Dihydroazepine is a common substructure in pharmaceuticals. One basic isomer is: 2,5-Dihydroazepine

And its dibenzo derivative, which is also common in CNS active drugs, is: 6,11-Dihydro-5H-dibenzo[b,E]azepine dihydroazepines

By default, Meeko breaks the seven-membered ring in 2,5-Dihydroazepine and adds a glue pair, but not for the dibenzo derivative. So I wonder if the ring breaking only occurs at the nonaromatic atoms. Also, does it have to be a carbon?

2,5-Dihydroazepine.pdbqt.txt dibenzo.pdbqt.txt

I was comparing ligands with different arrangements and types of fused rings then I noticed the way different kinds of polycyclic structures are treated. Sorry for opening issues for questions (and on weekends..). This is really not an issue and not a very urgent question. But I look forward to learning more about the ring breaking rules in Meeko.

Thanks a lot for your time and kind advice!

rwxayheee commented 9 months ago

I took a peak at macrocycle.py where i think _score_bond is for defining breakable and unbreakable bonds. Currently only carbon-carbon bonds are considered and both of them must be nonaromatic, according to:

        if not self.setup.get_element(atom_idx1) == 6 or self.setup.is_aromatic(atom_idx1):
            return -1
        if not self.setup.get_element(atom_idx2) == 6 or self.setup.is_aromatic(atom_idx2):
            return -1

If anyone is interested in breaking heteroatom-containing bonds (like C-N, C-O) or half-aromatic bonds in rings, additional penalization rules will be needed to handle such cases.

diogomart commented 9 months ago

Hello,

Ring breaking occurs only between non-aromatic carbons. This is just to avoid a large number of atom types. Your molecule is a great example of the glue atom types being insufficient.

It might be possible to have a quick workaround (which is a bit of an ugly hack). It consists of changing the atom type of one of the aromatic carbons from A to C. The scores would be a tiny bit different. Is this something you'd be interested in?

rwxayheee commented 9 months ago

Hi @diogomart, thanks for your kind reply. Yes, I wanted to be able to sample conformations of such seven-membered ring, by twisting the C-N torsion and torsions composed of aromatic and nonaromatic carbons.

It seems doable by allowing these cases in _score_bond. If I get a chance I can post the codes that enable breaking at such bonds, but I should do more calculations on such molecules (maybe looking at the low-frequency motions or the torsional barriers..) and try to make some fair penalization function for each case. In the presented case, I don't know which is more twistable, C-N or the half-aromatic bond.

I have a variety of such systems, with different type, number, position of heteroatoms, branches and double bonds. Actually, I was trying to do some scaffold hopping, then I realized the inconsistencies in the placements of glue atoms and N_rot. I completely understand that is very difficult to score the breakability of bonds in these compounds in a fair and unversal way. At the moment, I scored all these ligands as if rigid (but flexible macrocycles were allowed in docking, if possible). Does this sound like a ratioinal idea to you?

This is only a temporary fix to address mistakes I made & bias I left in some old work. Once we get that out, I can contribute more on this hopefully in the near future..

rwxayheee commented 9 months ago

Hello,

Ring breaking occurs only between non-aromatic carbons. This is just to avoid a large number of atom types. Your molecule is a great example of the glue atom types being insufficient.

It might be possible to have a quick workaround (which is a bit of an ugly hack). It consists of changing the atom type of one of the aromatic carbons from A to C. The scores would be a tiny bit different. Is this something you'd be interested in?

Yes, changing the atom types as you suggested is another way we can try on our side without modifying the codes. If that doesn't change too much of the pose ranking and still produces reasonable poses, the scores can be corrected afterwards.

Thanks again for your kind advice as always!

diogomart commented 9 months ago

Modified macrocycle.py to break bonds between atoms typed "C". 1f9071df9f8bb9bfe6c4d70f3816bc2b06fa79fb

Now, for "c1ccc2c(c1)CNc1ccccc1C2", we can force one of the aromatic carbons "A" to "C" with meeko's option

--add_atom_types '{"smarts": "NccCc", "atype": "C", "IDX": [5]}'

the SMARTS is specific enough to match only once in that molecule and we are typing the fifth atom of the SMARTS as "C", enabling one of the macrocycle bonds to be broken.

To allow breaking bonds between pairs of atom types other than C-C we would have to change the code in Vina and/or AD-GPU. Typing heteroatoms as C is probably a bad idea because N and O have different vdW radii and may participate in H-bonds.

At the moment, I scored all these ligands as if rigid (but flexible macrocycles were allowed in docking, if possible). Does this sound like a ratioinal idea to you?

Yes, especially for small macrocycles, the rotatable bonds don't really rotate much.

rwxayheee commented 9 months ago

Thanks so much for taking the time into this @diogomart !! That is a very nice & efficient walkaround.. Really appreciate this. We will play around with it and let you know how things go :) Thanks for always being so supportive

diogomart commented 9 months ago

Very happy to help! Let us know how it goes :)

rwxayheee commented 9 months ago

Hi Diogo,

I managed to break the C(aryl)–C(alkyl) bonds in my compounds. If you are interested, here is how I identified such bonds: https://github.com/rwxayheee/ICHIKO/blob/develop/additional_scripts/breaker.py

Here are a few examples (all are FDA-approved drugs) with such atom type replacements:

Screenshot 2023-09-29 at 9 28 26 PM

Better and more efficient implementations would be to make changes directly to functions in macrocycle.py, but I don’t want to pollute the code. Actually, I found that the C(aryl)–C(alkyl) torsions do not rotate much in the docking calculations, even though they were made flexible.. This might have something to do with how the broken ring model works with the rest of the algorithm (ligand flexibility representation or optimization method) in Vina. I don’t really know the exact cause at the moment. But there are many ways to generate such conformers before docking.

Picture1

Overall, the default ring breaking rules work pretty well with C(alkyl)-C(alkyl) torsions. We are very happy with the current implementation, it means a lot to our current project.. Thanks again for your kind support :)

diogomart commented 9 months ago

Thank you for sharing your results and script!

I don't know for sure why some macrocycles don't interconvert, but one hypothesis is that the input coordinates are slightly distorted to distribute strain (for example, the imine dihedral might be slightly different from zero). It is possible that reducing the magnitude of the "glue" potential could help in a few (probably not all) systems, say from 50 (default) to 5 kcal/mol/A for example. This is the --weight_glue command line option or the sixth (1-indexing) value in a list passed to v.set_weights.

rwxayheee commented 9 months ago

Hi @diogomart, thanks so much for your kind advice. I was look at the original paper then I thought about similar things, but didn't know there is an option to do it! That would be really useful. Thanks again for your kind reply

It is possible that reducing the magnitude of the "glue" potential could help in a few (probably not all) systems Yes, or using some boost potential on top of it will allow generation of conformers in an early stage of docking. The magnitude will depend on the types of rings though.

The "not rotating much" C(aryl)–C(alkyl) systems are more strained than the cycles that contain C(alkyl)–C(alkyl). I think the current glue potential is able to handle rings containing C(alkyl)–C(alkyl) and it matches with the current ring breaking rules in Meeko.

The cycle in the previous C(aryl)–C(alkyl) example is probably too strained.. Maybe I don't have enough docking data for the C(aryl)–C(alkyl) systems, but there are a few successful cases! Please see the following example..

Screenshot 2023-10-02 at 3 51 39 PM