patrickbryant1 / Umol

Protein-ligand structure prediction

Target pos #9

Closed LiviaPham closed 9 months ago

LiviaPham commented 9 months ago

Hi,

I'm trying to run Umol but currently I'm stuck at the "Predict" step and don't know how to get the "target_pos $POCKET_INDICES" data. Can you help me? Thanks for your program and I hope to receive your response.

Best wishes, Livia.

patrickbryant1 commented 9 months ago

Hi, The target positions are defined as all CBs (beta carbons) within 10 Å of any ligand atom in your binding site. You therefore need to know your binding site.
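The definition above can be sketched in a few lines of NumPy. This is only an illustration, not code from the Umol repository: the function name `pocket_indices` and its arguments are hypothetical, and it assumes you already have one CB coordinate per residue (in sequence order) and the ligand atom coordinates as arrays.

```python
import numpy as np

def pocket_indices(cb_coords, ligand_coords, cutoff=10.0):
    """Return 0-based row indices of CB atoms within `cutoff` Å of any ligand atom.

    cb_coords:     (n_residues, 3) array, one CB per residue (hypothetical input)
    ligand_coords: (n_ligand_atoms, 3) array
    """
    cb = np.asarray(cb_coords, dtype=float)
    lig = np.asarray(ligand_coords, dtype=float)
    # Pairwise distances between every CB and every ligand atom: (n_residues, n_ligand_atoms)
    dists = np.linalg.norm(cb[:, None, :] - lig[None, :, :], axis=-1)
    # A residue is in the pocket if its closest ligand atom is within the cutoff
    return np.flatnonzero(dists.min(axis=1) <= cutoff)

# Toy example: three residues on a line, one ligand atom at x = 1 Å
cb = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
lig = [[1.0, 0.0, 0.0]]
print(pocket_indices(cb, lig))
```

The returned values are positions in the input array. Whether Umol expects 0-based or 1-based positions is not stated in this thread, so check that against the Umol documentation before using them.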

If you look at the example here: https://colab.research.google.com/github/patrickbryant1/Umol/blob/master/Umol.ipynb you can see the target positions in stick format.

Hope this helps!

LiviaPham commented 9 months ago

Thank you, I'm extremely grateful for your instructions. Maybe my description was not clear. What I'm really stuck on is how to determine the "target_pos $POCKET_INDICES", i.e. the protein-ligand binding site, for a case that has not been researched before, like in your example: TARGET_POSITIONS: 51,52,54,55,56,57,58,59,60,61,62,63,65,66,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,93,94,95,96,97,98,99,100,101,102,104,105,125,128,129

Do you use a database or some other tool to find this information that you can share with me?

Have a nice day, Livia.

patrickbryant1 commented 9 months ago

Hi, This information has to be provided by you. Maybe look in the PDB for similar proteins with known ligands and take the site from there.

I realise that this field may be new to you (?). In general, a target site is predetermined for drug development (how else do you know you want to drug that site?). If the inverse problem is true and you have a drug that you know binds to something, but not how, I recommend getting a crystal structure.

Hope this helps!

LiviaPham commented 9 months ago

Oh I understand.

I am really a newbie in this field. I have recently joined a course and I am trying to read publications to understand it. If you have a step-by-step guide or instructions for figuring out the target positions in this example, please help me. I am now trying to apply your approach to a new protein-ligand pair.

Many thanks to you. Livia.

patrickbryant1 commented 9 months ago

What is your protein?

LiviaPham commented 9 months ago

Hi,

My ligand SMILES: CC1=C(Cl)C=C(NC(=O)NCC2=CC=C3C(=O)N(CC3=C2)C2CCC(=O)NC2=O)C=C1

My protein in Uniprot: Q96SW2

Thank you very much. Livia.

patrickbryant1 commented 9 months ago

Hi, Since you only have one ligand, I am not sure it is meaningful to use Umol. It is better to go to the lab. If you can't/don't want to do that I suggest perhaps focusing on another research topic as predictions will only get you so far.

Still, I provide a guide here: If you search your Uniprot ID and look at available structures you can see that this is available with a bound peptide: https://www.rcsb.org/3d-view/4M91/1

You can now download this structure and extract all CBs in the protein (chain A) that are within 10 Å of the peptide (chain B). These are your target residues:

47,48,49,50,51,53,54,55,56,57,58,59,60,61,89,90,91,94,97,98,102,103,104,105,106,107

sequence:

SKKENLLAEKVEQLMEWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVSRQANEEYQILANSWRYSSAFSNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAAEQLAKWIADRTDVHIRVFRL
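For readers who want to reproduce this extraction, here is a minimal sketch in plain Python that reads fixed-column PDB `ATOM` records. It is not the script used to produce the list above. The helper names `read_atoms` and `target_residues` are hypothetical, the use of CA for glycine (which has no CB) is an assumption, and the function returns residue numbers as written in the PDB file, which may need to be mapped onto positions in the sequence you give to Umol.

```python
import math

def read_atoms(pdb_lines, chain):
    """Collect (residue_number, residue_name, atom_name, (x, y, z)) for ATOM records of one chain."""
    atoms = []
    for line in pdb_lines:
        # Fixed PDB columns: atom name 13-16, residue name 18-20, chain 22, residue number 23-26
        if line.startswith("ATOM") and line[21] == chain:
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((int(line[22:26]), line[17:20].strip(), line[12:16].strip(), xyz))
    return atoms

def target_residues(pdb_lines, protein_chain="A", partner_chain="B", cutoff=10.0):
    """Residue numbers in `protein_chain` whose CB lies within `cutoff` Å of any atom in `partner_chain`."""
    pdb_lines = list(pdb_lines)
    protein = read_atoms(pdb_lines, protein_chain)
    partner = [xyz for _, _, _, xyz in read_atoms(pdb_lines, partner_chain)]
    hits = set()
    for resnum, resname, atom, xyz in protein:
        # Assumption: fall back to CA for glycine, which has no CB
        is_reference_atom = atom == "CB" or (resname == "GLY" and atom == "CA")
        if is_reference_atom and any(math.dist(xyz, p) <= cutoff for p in partner):
            hits.add(resnum)
    return sorted(hits)

# Usage (assumes the structure was saved locally under the hypothetical name 4M91.pdb):
# with open("4M91.pdb") as handle:
#     print(",".join(str(r) for r in target_residues(handle)))
```

The sketch ignores alternate locations and insertion codes, so a residue with alternate conformations is simply counted once if any of its CB records is within the cutoff.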

Plug this into Umol and you will obtain a prediction. Note that doing research with AI-tools without really knowing what you are looking for or why is not recommended.

patrickbryant1 commented 9 months ago

If you do this you get the following:

[Image: Screenshot 2023-12-13 at 13 19 17]

The average ligand plDDT is 54.2. This is quite low, and the complex is probably inaccurate.

Running single predictions is not the intended use for Umol. I recommend doing a large-scale screen towards your binding site and then verifying whatever you find in the lab. The inverse process you are currently applying is not very logical, since you will have to go to the lab regardless.