LPDI-EPFL / masif

MaSIF: Molecular surface interaction fingerprints. Geometric deep learning to decipher patterns in molecular surfaces.
Apache License 2.0

pdl1_benchmark does not predict the interface between 4ZQK_A and 4ZQK_B #13

Closed stebliankin closed 3 years ago

stebliankin commented 4 years ago

Hi,

I tried to run pdl1_benchmark with pdl1_benchmark_nn.py, but 4ZQK_A and 4ZQK_B are not in the list of top_scores. They matched only when I lowered iface_cutoff to 0.5, and then only at a single point: near_points: [1820] iface: [0.5888073] diff: [1.696772]

Is the model provided in masif/data/masif_pdl1_benchmark/nn_models/ the same as the one described in the paper, or should I train the model myself to reproduce the results?

Thank you so much!

pablogainza commented 4 years ago

Hi! Did you try this on Docker? There were some differences in that version. Could you set iface_cutoff to the default value (0.8), but increase DESC_DIST_CUTOFF to 2.5? (DESC_DIST_CUTOFF=2.5)

Please let me know if this works!
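
To make the two thresholds concrete, here is a minimal sketch of how they could interact when candidate patches are filtered. This is not the code from pdl1_benchmark_nn.py: the function name `select_candidates`, the tuple layout, the comparison directions (interface score above iface_cutoff, descriptor distance below DESC_DIST_CUTOFF), and the strict value 1.7 are all assumptions for illustration. Only the first candidate is taken from this thread; the other two are made up.

```python
from typing import List, Tuple

# Hypothetical layout: (point index, predicted interface score, descriptor distance)
Candidate = Tuple[int, float, float]


def select_candidates(
    candidates: List[Candidate], iface_cutoff: float, desc_dist_cutoff: float
) -> List[Candidate]:
    """Keep points with a high enough interface score and a close enough descriptor."""
    return [
        c for c in candidates
        if c[1] > iface_cutoff and c[2] < desc_dist_cutoff
    ]


candidates = [
    (1820, 0.5888073, 1.696772),  # the point reported in this thread
    (42, 0.91, 2.10),             # made-up example
    (77, 0.85, 3.40),             # made-up example
]

# Assumed strict settings: nothing passes both filters.
strict = select_candidates(candidates, iface_cutoff=0.8, desc_dist_cutoff=1.7)

# Lowering iface_cutoff alone admits only the reported point.
lowered = select_candidates(candidates, iface_cutoff=0.5, desc_dist_cutoff=1.7)

# Keeping iface_cutoff at 0.8 and relaxing the descriptor cutoff admits point 42.
relaxed = select_candidates(candidates, iface_cutoff=0.8, desc_dist_cutoff=2.5)

print(strict, lowered, relaxed)
```

Under these assumptions, raising DESC_DIST_CUTOFF widens the pool of candidates that reach the alignment stage without admitting points that have a weak interface score.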

stebliankin commented 4 years ago

Thank you for the quick reply! I am using the source code from this repo without Docker.

I tried setting DESC_DIST_CUTOFF=2.5. It found more patches as the best candidates for the 4ZQK_A-4ZQK_B complex, but none of them passed the alignment threshold at the end. In fact, every alignment score from the multidock function, for every protein, is about 0.5.

However, when I run "pdl1_benchmark.py" instead of "pdl1_benchmark_nn.py", it does select 4ZQK_A and 4ZQK_B as the best pair. Something seems to be wrong with the "nn_score" that evaluates the alignment...

pablogainza commented 4 years ago

Sorry about the late reply. If it is still an issue, could you give me more information? Are all scores 0? The Docker version seems to work well, which is strange since the Docker version downloads the solution from GitHub. Could you set DESC_DIST_CUTOFF even higher and limit the search to just PDL1?

stebliankin commented 4 years ago

Thank you for your reply! I have set DESC_DIST_CUTOFF to 3. Below are all the scores for 4ZQK_A-4ZQK_B:

Difference between descriptors: [0.82679707 0.81521094 0.929164 0.89320594 0.87356883 0.87354124 0.9221764 0.81365734 0.8651147 0.86851555 0.9054936 0.8775244 0.8802839 0.91344434 0.9137925 0.9305273 0.8980579 0.9060863 0.84222406 0.9121367 0.8704249 0.83516985 0.9099729 0.801642 0.8600515 0.80935884 0.8733127 0.93689716 0.87480325 0.8886727 0.85552794 0.8451138 0.86734414 0.9004062 0.81604356 0.83216256 0.83947295 0.85944843]

scores for alignment from multidock: [0.5004919 0.5004555 0.5003996 0.50052327 0.5004255 0.5005098 0.50045395 0.50053674 0.50055635 0.50043255 0.50042903 0.5003466 0.5003477 0.50033927 0.5004745 0.500459 0.5004788 0.50030833 0.500387 0.50050133 0.50043577 0.50049 0.5004223 0.5004021 0.5004072 0.5005649 0.50046057 0.5003941 0.50044996 0.5004093 0.5003859 0.5005203 0.5004807 0.5004856 0.5003352 0.50040734 0.50058347 0.5004427 ]

In fact, the alignment score between any pair of protein chains is about 0.5. The nn_score model was taken from "masif/data/masif_pdl1_benchmark/models".
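
Scores that all sit within a hair of 0.5 are consistent with a binary scorer that is not discriminating at all, for example because its weights were not loaded or applied as intended. That is an interpretation, not something established here. A small sanity check along these lines can flag the situation automatically; the helper name `looks_degenerate` and the tolerance are assumptions, and the list holds the first five multidock scores reported above.

```python
import statistics


def looks_degenerate(scores, center=0.5, tol=1e-2):
    """Return True if every score lies within `tol` of `center`."""
    return all(abs(s - center) <= tol for s in scores)


# First five alignment scores from the multidock output above.
multidock_scores = [0.5004919, 0.5004555, 0.5003996, 0.50052327, 0.5004255]

print("degenerate:", looks_degenerate(multidock_scores))
print("spread (population std):", statistics.pstdev(multidock_scores))
```

A healthy scorer would be expected to spread its outputs over a wider range, so a near-zero spread is a hint to check the model file and the framework version before tuning any cutoffs.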

The only modification I made is the TensorFlow version. The requirements state that the tensorflow version should be 1.9.0. However, score_nn.py uses keras.layers.ReLU(), which is not available in tf 1.9.0. It produces the following error: AttributeError: module 'tensorflow.tools.api.generator.api.keras.layers' has no attribute 'ReLU'.

When I updated tf to 1.10.0, the error disappeared. The weird behavior of nn_score might be caused by this version difference. Did you train it under 1.9.0? If so, how did you avoid the keras.layers.ReLU() problem?

stebliankin commented 3 years ago

Update: The issue was resolved by updating tensorflow to v1.12. The second-stage alignment now works as expected. Thank you!
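
Since the behavior in this thread depended on the installed TensorFlow version (1.9.0 lacked keras.layers.ReLU, 1.10.0 imported cleanly but gave scores near 0.5, and 1.12 worked), a version guard at the top of a script can fail early instead of silently producing bad scores. The sketch below is an illustration only: `parse_version` and `meets_minimum` are hypothetical helpers, and treating 1.12 as the minimum is an assumption based solely on the report above.

```python
def parse_version(version):
    """Turn '1.12.0' into (1, 12, 0); non-numeric suffixes such as '-rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(version, minimum="1.12"):
    """Return True if `version` is at least `minimum` (compared numerically)."""
    return parse_version(version) >= parse_version(minimum)


# Versions mentioned in this thread.
for v in ("1.9.0", "1.10.0", "1.12.0"):
    print(v, meets_minimum(v))

# Usage inside a script (assumes TensorFlow is installed):
#   import tensorflow as tf
#   if not meets_minimum(tf.__version__):
#       raise RuntimeError("TensorFlow >= 1.12 expected, found " + tf.__version__)
```

The comparison is numeric rather than string-based, so 1.9.0 correctly sorts below 1.12.0.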