anton-bushuiev / PPIRef

Dataset and package for working with protein-protein interactions in 3D
https://ppiref.readthedocs.io
MIT License
57 stars, 5 forks

Doubts during use #4

Closed LDAIprotein closed 3 weeks ago

LDAIprotein commented 1 month ago

Hello, this is very good work, but I ran into a problem that puzzled me. I used the protein 7qih_A_B (CASP15 name: H1106) to search for similar PPIs, and the search returned only 1xxh_A_B (IDist = 0.0303, IS-score = 0.146). I then searched with two predicted models of 7qih that have completely different docking modes, one high-scoring (TM: 0.933) and one low-scoring (TM: 0.194), but got the same result. This is very strange.

High-quality model: [image]. PPI search results: [image].

Low-quality model: [image]. PPI search results: [image].

LDAIprotein commented 1 month ago

Does this mean that the PPI pattern of the query is highly correlated with its sequence?

anton-bushuiev commented 1 month ago

Hi @LDAIprotein,

It is extremely unlikely to get nearly the same results for such different interactions. My guess is that a bug causes the same PPI interface to actually be used for both searches (the slightly different distances for some of the found entries are probably caused by numerical issues).

If you share the script you are using, I can take a look.

LDAIprotein commented 1 month ago

Thank you for the timely feedback. We found the reason: the input PDB structures were full-length (not cut to the interface with the 6 Å cutoff), which led to exactly the same search results. That was the wrong way to use the search. It did, however, suggest to us that the IDist search has some correlation with the sequence.
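
To illustrate why the cutoff matters, here is a minimal sketch of a distance-based interface cut. It is not PPIRef's own extraction code: the function `interface_residues`, the tuple layout `(chain, residue_number, element, (x, y, z))`, and the toy coordinates are all assumptions made up for this example. It only shows that two different docking poses of the same chains contain the same full-length residues, while their 6 Å interfaces differ.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def interface_residues(atoms, chain_a, chain_b, cutoff=6.0):
    """Return (chain, residue_number) pairs that have a heavy atom within
    `cutoff` angstroms of a heavy atom in the partner chain.

    `atoms` is a hypothetical flat list of tuples:
    (chain, residue_number, element, (x, y, z)).
    """
    heavy = [a for a in atoms if a[2] != "H"]  # ignore hydrogens
    side_a = [a for a in heavy if a[0] == chain_a]
    side_b = [a for a in heavy if a[0] == chain_b]
    found = set()
    # Brute-force pairwise check; fine for a toy example.
    for a in side_a:
        for b in side_b:
            if dist(a[3], b[3]) <= cutoff:
                found.add((a[0], a[1]))
                found.add((b[0], b[1]))
    return found


# Toy data: chain A is fixed, chain B is placed in two different poses.
chain_a_atoms = [
    ("A", 1, "C", (0.0, 0.0, 0.0)),
    ("A", 2, "C", (20.0, 0.0, 0.0)),
]
pose_1 = chain_a_atoms + [("B", 1, "C", (4.0, 0.0, 0.0))]   # B next to A:1
pose_2 = chain_a_atoms + [("B", 1, "C", (24.0, 0.0, 0.0))]  # B next to A:2

# Full-length input: the residue sets are identical for both poses.
full_1 = {(a[0], a[1]) for a in pose_1}
full_2 = {(a[0], a[1]) for a in pose_2}

# Interface input: the 6 A cut separates the two poses.
iface_1 = interface_residues(pose_1, "A", "B")
iface_2 = interface_residues(pose_2, "A", "B")
```

In this toy case `full_1` and `full_2` are equal, whereas `iface_1` and `iface_2` are not, which is consistent with the identical search results we observed when the full-length structures were passed in.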

anton-bushuiev commented 1 month ago

I see. That makes sense! Please let me know if anything else is unclear.

LDAIprotein commented 1 month ago

Thank you. I have no other questions. This is very good work, and we believe it will have a positive impact on this field. We look forward to your 10 Å data and the follow-up research.