forlilab / Meeko

Interfacing RDKit and AutoDock
GNU Lesser General Public License v2.1
187 stars 46 forks source link

Covalent Docking #146

Open kunal19977835 opened 1 month ago

kunal19977835 commented 1 month ago

I performed covalent docking using meeko and autodock, When I received the results the dlg file seems to be missing some atoms. I’m not able to create a file for md simulation because it shows incorrect bonds and atoms

diogomart commented 1 month ago

can you post the files?

diogomart commented 1 month ago

and list the missing atoms and incorrect bonds

kunal19977835 commented 1 month ago

test.zip i've uploaded the dlg file, along with original protein structure (SOD) and the rigid protein and flex protein files. not able to combine as there are some atoms missing in the rigid protein file and when I combine using original protein file (SOD) in pdb format, it shows multiple bonds in chimera

diogomart commented 1 month ago

In a docking with flexible sidechains, the protein is split into two PDBQT files, one for the static atoms (suffixed with _rigid) and another for the movable sidechain atoms (suffixed with _flex). Unfortunately, we don't have a script to rebuild an entire protein updated with the docked positions of the sidechain, but that will be implemented soon and released in v0.6. Meanwhile, if you load the rigid part and the docked SDFs into a viewer, it's likely that the bonds between the flexible sidechain and the rest of the protein are missing because they are in different objects.

kunal19977835 commented 1 month ago

Is there any other way to work around this problem?

diogomart commented 1 month ago

Two laborious ways come to mind:

kunal19977835 commented 1 month ago

test1.zip I already tried that, but when i uploaded this file in schrodinger maestro, I could see multiple bonds which showed errors. Can you please look at it?

rwxayheee commented 1 month ago

Hi @kunal19977835 The hydrogen named 1HC is too close to the carbonyl oxygen of Val146. This is because the hydrogen is nonpolar (therefore treated implicitly) in AutoDock. When you import this PDB file, schrodinger maestro attempts to guess bonds based on proximity. To resolve the conflicts, you could edit the bond manually, then run a minimization in maestro

kunal19977835 commented 4 weeks ago

IMG_4610 ![Uploading IMG_4608.jpeg…] ![Uploading IMG_4609.jpeg…] I adjusted bond length as you asked, but still got these errors

kunal19977835 commented 4 weeks ago

IMG_4609 IMG_4608

diogomart commented 4 weeks ago

Looks like Maestro is creating non existing bonds based on atom names. For example, the bond between backbone N and the H on the carboxylic acid is probably being hallucinated because both of these hydrogens are named "H".

rwxayheee commented 4 weeks ago

Hi @kunal19977835 Could you tell us a little bit more about how you generated cys.pdb?

When you import this structure in maestro, you will see the following problems with atom names:

WARNING mmpdb_build_altloc_db_for_atom: Atom records 78 and 79 in residue 6 (CYS ), chain 'A' have same name ' C  '

WARNING mmpdb_build_altloc_db_for_atom: Atom records 78 and 80 in residue 6 (CYS ), chain 'A' have same name ' C  '

WARNING mmpdb_build_altloc_db_for_atom: Atom records 78 and 81 in residue 6 (CYS ), chain 'A' have same name ' C  '

WARNING mmpdb_build_altloc_db_for_atom: Atom records 82 and 83 in residue 6 (CYS ), chain 'A' have same name ' O  '

WARNING mmpdb_build_altloc_db_for_atom: Atom records 82 and 84 in residue 6 (CYS ), chain 'A' have same name ' O  '

WARNING mmpdb_build_altloc_db_for_atom: Atom records 86 and 87 in residue 6 (CYS ), chain 'A' have same name ' S  '

WARNING mmpdb_build_altloc_db_for_atom: Atom records 88 and 89 in residue 6 (CYS ), chain 'A' have same name ' H  '

  Imported 1 structure

You can also use Protein Preparation > View Problems to view these problems. The unexpected names are causing atom typing issue for maestro, because CYS is a standard residue and has a certain template in maestro. The quick way to fix it is to rename the atoms (maybe using other CYS in your system as a template, and definitely isolate the covalent conjugate as a new residue).

kunal19977835 commented 4 weeks ago

How do I do that?

kunal19977835 commented 4 weeks ago

I used meeko to align homocysteine as a ligand and bound it with 6th cysteine residue, then used covalent docking scripts from autodock. Rest the problem I’m facing I’ve reported earlier in this tread

diogomart commented 4 weeks ago

@kunal19977835 since this is an issue with Maestro, you could contact their support team.

rwxayheee commented 4 weeks ago

@kunal19977835 Do you think (in any import/export steps) meeko might have dropped the atom names?

kunal19977835 commented 4 weeks ago

I don’t know, could be covalent bond scripts from autodock. Can you suggest me some work around to fix these? Like which software to use and how

rwxayheee commented 4 weeks ago

Hi @kunal19977835 It's easy to manually fix the atom names, if you don't have too many atom names missing. This requires some basic knowledge on the standard atom names and the PDB format. To do this, just open the file in a text editor, then change the atom names. From this:

ATOM     77  N   CYS A   6      20.631  75.490  11.558  1.00  0.00           N
ATOM     78  C   CYS A   6      18.860  75.944   9.976  1.00  0.00           C
ATOM     79  C   CYS A   6      17.860  73.952  11.545  1.00  0.00           C
ATOM     80  C   CYS A   6      17.722  72.545  10.980  1.00  0.00           C
ATOM     81  C   CYS A   6      19.032  71.814  11.122  1.00  0.00           C
ATOM     82  O   CYS A   6      19.320  76.992   9.526  1.00  0.00           O
ATOM     83  O   CYS A   6      20.048  72.277  11.611  1.00  0.00           O
ATOM     84  O   CYS A   6      18.978  70.557  10.627  1.00  0.00           O
ATOM     85  CB  CYS A   6      18.589  76.525  12.438  1.00  0.00           C
ATOM     86  S   CYS A   6      16.793  76.684  12.191  1.00  0.00           S
ATOM     87  S   CYS A   6      16.345  74.915  11.236  1.00  0.00           S
ATOM     88  H   CYS A   6      21.145  76.215  11.072  1.00  0.00           H
ATOM     89  H   CYS A   6      18.108  70.311  10.260  1.00  0.00           H
ATOM     90 1HC  CYS A   6      18.034  73.892  12.619  1.00  0.00           H
ATOM     91 2HC  CYS A   6      18.706  74.449  11.071  1.00  0.00           H
ATOM     92 3HC  CYS A   6      16.946  72.009  11.527  1.00  0.00           H
ATOM     93 4HC  CYS A   6      17.450  72.602   9.926  1.00  0.00           H

To this:

ATOM     77  N   CYS A   6      20.631  75.490  11.558  1.00  0.00           N
ATOM     78  C   CYS A   6      18.860  75.944   9.976  1.00  0.00           C
ATOM     79  C1  LIG A 152      17.860  73.952  11.545  1.00  0.00           C
ATOM     80  C2  LIG A 152      17.722  72.545  10.980  1.00  0.00           C
ATOM     81  C3  LIG A 152      19.032  71.814  11.122  1.00  0.00           C
ATOM     82  O   LIG A 152      19.320  76.992   9.526  1.00  0.00           O
ATOM     83  O1  LIG A 152      20.048  72.277  11.611  1.00  0.00           O
ATOM     84  O2  LIG A 152      18.978  70.557  10.627  1.00  0.00           O
ATOM     85  CB  CYS A   6      18.589  76.525  12.438  1.00  0.00           C
ATOM     86  SG  CYS A   6      16.793  76.684  12.191  1.00  0.00           S
ATOM     87  S1  LIG A 152      16.345  74.915  11.236  1.00  0.00           S
ATOM     88  H   CYS A   6      21.145  76.215  11.072  1.00  0.00           H
ATOM     89 1HO2 LIG A 152      18.108  70.311  10.260  1.00  0.00           H
ATOM     90 1HC1 LIG A 152      18.034  73.892  12.619  1.00  0.00           H
ATOM     91 2HC1 LIG A 152      18.706  74.449  11.071  1.00  0.00           H
ATOM     92 1HC2 LIG A 152      16.946  72.009  11.527  1.00  0.00           H
ATOM     93 2HC2 LIG A 152      17.450  72.602   9.926  1.00  0.00           H

cys_mod.pdb.txt You can change ATOM to HETATM for LIG atoms if you want, but it doesn't matter to maestro. Now, if you load this structure in maestro, you will not get the same errors. However, there are still bonds to correct, hydrogens to add (for CA and CB), and you may do it in maestro (you can look it up from the maestro manual or a basic tutorial). Follow the basic procedure of "Protein Preparation Wizard" in maestro would be a good start.

Many software (Schrodinger, Amber) still use atom names for parameterization. It's important to have the correct atom names and residue names as required by the software. If your docking input were generated from Meeko and no further changes were done, I do think it's some part of Meeko that unintentionally dropped the atom names. Based on the DLG you posted, the only sidechain atom name kept for CYS 6 is CB and that's indeed very confusing the reactive atom's original name (SG) is overridden and the covalent ligand may have redundant atom names. Redundant atom names might cause issues when they're in the same residue. We will look into that in the near future. Thanks for the reporting

kunal19977835 commented 4 weeks ago

Thank you so much, I’ll try and get back to you if this works

kunal19977835 commented 1 week ago

The structure is fixed thank you, can you guide me through steps to perform md simulation of this covalently docked ligand using gromacs. i am facing certain errors

diogomart commented 1 week ago

That's an excellent question, we have been working on that and will publish tutorials in the near future, but they are not ready yet. For sure we will cover openmm, and I think it should be possible to convert the input to gromacs.

kunal19977835 commented 1 week ago

Is there a work around meanwhile I can work on, I have read that certain people are suggesting adding modified residues in the force fields along with defining sulphide bonds.But since I am. Novice i’m facing issues writing.rtp files and making necessary changes

rwxayheee commented 1 week ago

Hello @kunal19977835

Just a quick note: You don't have to use this docked conformation for parameter generation, or you must do some level of optimization before using it for simulation. The docked conformation appears to be very strained, and this could cause problems if your parameter derivation protocol uses the input geometry for atom typing or parameter verification. There are different ways to set up system with covalent ligands depending on what force field you want. If you want to use the CHARMM36 force field for your system, consider running a CGenFF (https://cgenff.silcsbio.com/) job for the cys-ligand conjugate (or Desaminozystin). Then compare the stream file with the expected .rpt format GROMACS needs. It's best to ask and discuss on the GROMACS forum for technical details: https://gromacs.bioexcel.eu/

Another thing I noticed in the dlg file (which might partly explain the strained conformation..) is:

INPUT FLEXIBLE RESIDUES PDBQT FILE:
___________________________________

INPUT-FLEXRES-PDBQT: BEGIN_RES CYS A   6
INPUT-FLEXRES-PDBQT: REMARK  2 active torsions:
INPUT-FLEXRES-PDBQT: REMARK  status: ('A' for Active; 'I' for Inactive)
INPUT-FLEXRES-PDBQT: REMARK    1  A    between atoms: CB   and  S   
INPUT-FLEXRES-PDBQT: REMARK    2  A    between atoms: S    and  S   
INPUT-FLEXRES-PDBQT: ROOT
INPUT-FLEXRES-PDBQT: ATOM      1  CB  CYS A   6      18.589  76.525  12.438  1.00  0.00     0.069 C 
INPUT-FLEXRES-PDBQT: ENDROOT
INPUT-FLEXRES-PDBQT: BRANCH   1   2
INPUT-FLEXRES-PDBQT: ATOM      2  S   CYS A   6      16.793  76.684  12.191  1.00  0.00    -0.082 SA
INPUT-FLEXRES-PDBQT: BRANCH   2   3
INPUT-FLEXRES-PDBQT: ATOM      3  S   CYS A   6      16.275  77.972  13.713  1.00  0.00    -0.082 SA
INPUT-FLEXRES-PDBQT: ATOM      4  C   CYS A   6      16.980  79.538  13.107  1.00  0.00     0.080 C 
INPUT-FLEXRES-PDBQT: ATOM      5  C   CYS A   6      16.646  80.677  14.060  1.00  0.00     0.107 C 
INPUT-FLEXRES-PDBQT: ATOM      6  C   CYS A   6      17.354  81.931  13.615  1.00  0.00     0.153 C 
INPUT-FLEXRES-PDBQT: ATOM      7  O   CYS A   6      18.085  82.021  12.644  1.00  0.00    -0.649 OA
INPUT-FLEXRES-PDBQT: ATOM      8  O   CYS A   6      17.105  82.981  14.430  1.00  0.00    -0.776 OA
INPUT-FLEXRES-PDBQT: ATOM      9  H   CYS A   6      16.508  82.775  15.173  1.00  0.00     0.167 HD
INPUT-FLEXRES-PDBQT: ENDBRANCH   2   3
INPUT-FLEXRES-PDBQT: ENDBRANCH   1   2
INPUT-FLEXRES-PDBQT: END_RES CYS A   6

The 3-Mercaptopropionic acid part is completely flat in 3D and does not have flexible torsions. This may or may not be something you want, and might cause subsequent problems in parameter derivation