BioinfoMachineLearning / DeepInteract

A geometric deep learning pipeline for predicting protein interface contacts. (ICLR 2022)
https://zenodo.org/record/6671582
GNU General Public License v3.0

About the feature generation process. #21

Open peter5842 opened 7 months ago

peter5842 commented 7 months ago

I noticed something strange: when I input 6cp8_a.pdb and 6cp8_c.pdb (which are part of the CASP13&14 test data), I can't reproduce the result from the test dataset you provide. Checking where they differ, I found that node features [36:43] are different, so I suspect this is what causes the different result. As a sample, below is a graph node feature generated with this repository's lit_model_predict.py.

Here is the first residue's node feature from 6cp8_a.pdb, generated by me:
[ 0.0000e+00, 1.0000e+00, -2.7021e-01, -9.8549e-01, 0.0000e+00, 9.6280e-01, -1.6976e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 1.0000e+00, 7.1768e-05, 9.1038e-01, 9.2546e-01, 6.3999e-01, 8.7012e-01, 1.0000e+00, 6.9612e-01, 0.0000e+00, 0.0000e+00, 5.0000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 5.0000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 3.2309e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 6.7689e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00]

Here is the first residue's node feature for 6cp8_a.pdb from your test dataset:
[ 0.0000e+00, 1.0000e+00, -2.7021e-01, -9.8549e-01, 0.0000e+00, 9.6280e-01, -1.6976e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 1.0000e+00, 3.0420e-05, 1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00, 0.0000e+00, 0.0000e+00, 5.0000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 5.0000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 3.2309e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 6.7689e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00]

To make sure both come from the same PDB file (uploaded), I compared their positions and found that they are the same. So I would like to know how this happened. Thank you, @amorehead. 6CP8_A_C.zip
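
For reference, the index comparison described above can be reproduced with a minimal sketch like the one below. It is only an illustration in plain Python and not code from this repository: `differing_indices` is a hypothetical helper name, the tolerances are assumed values, and the two lists are just entries 34 to 43 copied from the two vectors quoted above.

```python
import math


def differing_indices(a, b, rel_tol=1e-4, abs_tol=1e-6):
    """Return the indices at which two equal-length feature vectors differ.

    The tolerances are assumptions chosen for this illustration.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Entries 34..43 of the first-residue node features quoted above.
OFFSET = 34
mine = [1.0, 1.0, 7.1768e-05, 9.1038e-01, 9.2546e-01,
        6.3999e-01, 8.7012e-01, 1.0, 6.9612e-01, 0.0]
reference = [1.0, 1.0, 3.0420e-05, 1.0, 1.0,
             1.0, 1.0, 1.0, 1.0, 0.0]

# Shift the local indices back to positions in the full feature vector.
diff = [OFFSET + i for i in differing_indices(mine, reference)]
print(diff)  # [36, 37, 38, 39, 40, 42]
```

For these two samples, every mismatch falls inside the [36:43] slice. Index 41 is 1.0 in both vectors, so it is not reported.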