constructing a protein map code

vtarasv / 3d-prot-dta

3DProtDTA: a deep learning model for drug-target affinity prediction based on residue-level protein graphs

constructing a protein map code #5

Open zhaolongNCU opened 9 months ago

zhaolongNCU commented 9 months ago

Thank you very much for your important contribution to this field. Could you share the code for constructing the protein map? I would like to use your construction method to process another batch of data. Thank you very much!

vtarasv commented 9 months ago

The concrete code for data processing and feature engineering is currently a trade secret used by RECEPTOR.AI, so I cannot share it directly. However, you can find useful information in this issue.

shrimonmuke0202 commented 8 months ago

Hi, I would like to know how you used domain annotations from UniProt to determine the ligand binding sites for a particular protein.

vtarasv commented 8 months ago

You can find the answer here: https://github.com/vtarasv/3d-prot-dta/issues/3#issuecomment-1673372653

shrimonmuke0202 commented 8 months ago

Thank you for your response. I am interested in generating residue graphs from PDB files and would appreciate it if you could share the source code or the method you used for this purpose.

Additionally, I have a question regarding the treatment of mutated proteins within the DAVIS dataset, especially in relation to PDB files generated by AlphaFold. For example, in the file named "ABL1(E255K)-phosphorylated.pdb", the amino acid residues are numbered from 1 to 251, yet the mutation occurs at position 255 (from E to K). If the sequence is truncated to 251 residues before using AlphaFold, the mutation at position 255 would not be included. Could you explain how you addressed this issue?

vtarasv commented 8 months ago

I cannot provide the source code, but you can find the hints for generating protein graphs here: https://github.com/vtarasv/3d-prot-dta/issues/2#issuecomment-1544427302

Regarding your question, you are right: the mutations are not always included in the generated structures. We did not address this in the work because it did not affect the vast majority of proteins.
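For anyone who wants to find the affected entries, here is a minimal sketch that checks whether the mutation named in a file label is actually covered by the structure. It is not code from this repository. The function names (`parse_mutation`, `residue_map`, `mutation_status`) are hypothetical, and the check assumes the PDB file keeps the same residue numbering as the mutation label, which a truncated and renumbered structure will not do.

```python
import re

# One-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def parse_mutation(name):
    """Extract (wild_type, position, mutant) from a label such as 'ABL1(E255K)-phosphorylated'."""
    match = re.search(r"\(([A-Z])(\d+)([A-Z])\)", name)
    if match is None:
        return None
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def residue_map(pdb_text):
    """Map residue number -> one-letter code, read from the C-alpha ATOM records."""
    residues = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residues[int(line[22:26])] = THREE_TO_ONE.get(line[17:20], "X")
    return residues


def mutation_status(name, pdb_text):
    """Report whether the labelled mutation is present in the structure."""
    mutation = parse_mutation(name)
    if mutation is None:
        return "no mutation in name"
    wild_type, position, mutant = mutation
    residues = residue_map(pdb_text)
    if position not in residues:
        return "position not in structure"
    if residues[position] == mutant:
        return "mutant residue present"
    if residues[position] == wild_type:
        return "wild-type residue present"
    return "unexpected residue"
```

For a structure numbered 1 to 251, a label such as "ABL1(E255K)-phosphorylated" would come back as "position not in structure", which matches the case described above.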

shrimonmuke0202 commented 8 months ago

Thanks, but you did not mention how you generate the residue graph. Could you give some hints on how to build it?

vtarasv commented 8 months ago

The concrete code for data processing and feature engineering is currently a trade secret used by RECEPTOR.AI. Thus, I cannot provide the source code, but you can find hints for generating residue graphs here: https://github.com/vtarasv/3d-prot-dta/issues/2#issuecomment-1544427302

shrimonmuke0202 commented 8 months ago

Could you give a hint on just the graph-creation part of the data preprocessing? It would help me very much.

vtarasv commented 8 months ago

You may use the scalar graph features proposed by TankBind; that should work just fine: https://github.com/luwei0917/TankBind/blob/ff85f511db11d7a3e648d2e01cd6fdb4f9823483/tankbind/feature_utils.py#L201
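Node features aside, the graph skeleton itself can be sketched in a few lines. The example below is neither the 3DProtDTA code nor TankBind's; it is a generic contact graph that connects residues whose C-alpha atoms lie within a distance cutoff. The 8 Å cutoff and the function names `ca_coordinates` and `contact_edges` are assumptions chosen for illustration.

```python
import math


def ca_coordinates(pdb_text):
    """Return [((chain, resseq, icode), (x, y, z)), ...] for every C-alpha ATOM record.

    Alternate locations are not filtered here, so a residue with two
    conformers would appear twice.
    """
    nodes = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]), line[26:27].strip())
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            nodes.append((key, xyz))
    return nodes


def contact_edges(nodes, cutoff=8.0):
    """Return (i, j, distance) for each residue pair with CA-CA distance <= cutoff (assumed 8 A)."""
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            distance = math.dist(nodes[i][1], nodes[j][1])
            if distance <= cutoff:
                edges.append((i, j, distance))
    return edges
```

The node list gives one entry per residue, and the edge list can be turned into whatever tensor format the downstream graph library expects. Typed edges, such as interaction-specific ones, would need extra per-pair checks on top of this.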

shrimonmuke0202 commented 8 months ago

Thanks for your prompt reply, I will look into it. If there is any problem I will contact you.

shrimonmuke0202 commented 8 months ago

How do you calculate perpendicular and parallel pi-stacking to generate the edges between residues?

vtarasv commented 8 months ago

I used the Open Drug Discovery Toolkit (ODDT), treating each residue as a separate molecule: https://oddt.readthedocs.io/en/latest/rst/oddt.html#module-oddt.interactions
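To make the geometry behind the parallel/perpendicular distinction concrete, here is a simplified, stdlib-only sketch. It does not call ODDT and is not its implementation: it classifies a pair of rings by centroid distance and by the angle between ring normals. The 5 Å cutoff, the 30° tolerance, and the function names are assumptions, and identifying which atoms form each aromatic ring is left to the caller.

```python
import math


def centroid(points):
    """Geometric centre of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def ring_normal(points):
    """Unit normal of a near-planar ring, from two centroid-to-atom vectors."""
    c = centroid(points)
    a = [points[0][k] - c[k] for k in range(3)]
    b = [points[1][k] - c[k] for k in range(3)]
    normal = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    length = math.sqrt(sum(x * x for x in normal))
    return tuple(x / length for x in normal)


def classify_stacking(ring1, ring2, cutoff=5.0, tolerance=30.0):
    """Return 'parallel', 'perpendicular', or None for two rings.

    cutoff (angstroms) and tolerance (degrees) are assumed values.
    """
    if math.dist(centroid(ring1), centroid(ring2)) > cutoff:
        return None
    n1, n2 = ring_normal(ring1), ring_normal(ring2)
    # The sign of a normal is arbitrary, so fold the angle into 0..90 degrees.
    cos_angle = min(1.0, abs(sum(x * y for x, y in zip(n1, n2))))
    angle = math.degrees(math.acos(cos_angle))
    if angle <= tolerance:
        return "parallel"
    if angle >= 90.0 - tolerance:
        return "perpendicular"
    return None
```

A stricter check would also require the ring centres to be offset in a particular way relative to each other; this sketch only uses distance and plane angle.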

shrimonmuke0202 commented 8 months ago

How do you treat each residue as a separate molecule? I am new to the Open Drug Discovery Toolkit.

vtarasv commented 8 months ago

You may split a PDB file by residue ID and treat each resulting file as a separate molecule in PDB format.
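A minimal sketch of that splitting step, using only the standard library, could look like the following. It is not the original preprocessing code. The function names and the output file naming scheme are hypothetical, and the fixed column positions follow the standard PDB ATOM/HETATM record layout.

```python
from pathlib import Path


def split_pdb_by_residue(pdb_text):
    """Group ATOM/HETATM lines by (chain ID, residue number, insertion code)."""
    residues = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]), line[26:27].strip())
            residues.setdefault(key, []).append(line)
    return residues


def write_residue_files(residues, out_dir):
    """Write one small PDB file per residue and return the list of paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for (chain, resseq, icode), lines in residues.items():
        # Hypothetical naming scheme: <chain>_<number><insertion code>.pdb
        chain_label = chain.strip() or "_"
        path = out_dir / f"{chain_label}_{resseq}{icode}.pdb"
        path.write_text("\n".join(lines) + "\nEND\n")
        paths.append(path)
    return paths
```

Each resulting file can then be loaded as its own molecule by whatever toolkit is in use. Note that cutting a chain into residues breaks the peptide bonds, so how a given toolkit assigns hydrogens or valences to the fragments should be checked rather than assumed.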

shrimonmuke0202 commented 8 months ago

Could you give an example of how to apply this method? It would be very helpful for me.