DeepGraphLearning / GearNet

GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
MIT License

asking about how to obtain the new graph based on contrast learning #37

Closed Yanara-Tian closed 1 year ago

Yanara-Tian commented 1 year ago

Hello, I am not very experienced at reading code, so I have a few questions about the model. (I am very interested in your work, so I apologize for asking so many.) According to the mc_gearnet_edge.yaml file, MultiviewContrast in the model is followed by a multi-layer perceptron (MLP). However, MultiviewContrast produces two outputs, output1 and output2, each consisting of graph features and node features, while the MLP takes only one input.

  1. What is the input to the MLP?
  2. What is the model argument in the MultiviewContrast module? Its constructor is: def __init__(self, model, crop_funcs, noise_funcs, num_mlp_layer=2, activation="relu", tau=0.07): super(MultiviewContrast, self).__init__(). Is it GeometryAwareRelationalGraphNeuralNetwork?
  3. At which step do you obtain the new graph based on contrastive learning mentioned in your paper? (The MultiviewContrast module has two outputs, and I don't know which one is better.)

Looking forward to your reply very much!

Oxer11 commented 1 year ago

Thx for your interest in our work!

  1. We use the graph features output by the model as the input to the MLP, as shown in https://github.com/DeepGraphLearning/torchdrug/blob/a959f68f0c19f368be9e380f5a587de6970b3c67/torchdrug/models/infograph.py#L150-L151. Note that we use the same MLP for both views to ensure the siamese architecture.
  2. Yes. The model should be your protein structure encoder, which is GearNet in this case.
  3. I'm not sure whether I understand this question correctly. In contrastive learning, we construct two views for each graph with crop_func and noise_func. Then, after pre-training, we keep the encoder for downstream tasks. This means, if you have a new protein, you should feed it directly to the encoder. You don't need to feed it into MultiviewContrast and get views for it. MultiviewContrast is only used for pre-training.
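
To make the three points above concrete, here is a minimal sketch in plain PyTorch. It is not the MultiviewContrast code from this repository: ToyEncoder, ToyMultiviewContrast, random_crop and add_noise are hypothetical stand-ins, the Gaussian feature noise is only a placeholder for a real noise function, and the InfoNCE-style loss is an assumption suggested by the tau argument. What it illustrates is that one MLP is shared by both views and that only the graph features are passed to it.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a structure encoder such as GearNet."""

    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, hidden_dim)

    def forward(self, node_input):
        # node_input: (batch, num_nodes, in_dim)
        node_feature = torch.relu(self.linear(node_input))
        graph_feature = node_feature.mean(dim=1)  # mean readout over nodes
        return {"node_feature": node_feature, "graph_feature": graph_feature}


def random_crop(x, length):
    # Placeholder crop function: keep a random contiguous window of nodes.
    start = torch.randint(0, x.shape[1] - length + 1, (1,)).item()
    return x[:, start:start + length]


def add_noise(x, scale=0.1):
    # Placeholder noise function: perturb node inputs with Gaussian noise.
    return x + scale * torch.randn_like(x)


class ToyMultiviewContrast(nn.Module):
    """Illustrative wrapper: one encoder and one MLP shared by both views."""

    def __init__(self, model, hidden_dim, tau=0.07):
        super().__init__()
        self.model = model
        self.tau = tau
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, view1, view2):
        # Only the graph-level features are projected; the same MLP serves both views.
        z1 = F.normalize(self.mlp(self.model(view1)["graph_feature"]), dim=-1)
        z2 = F.normalize(self.mlp(self.model(view2)["graph_feature"]), dim=-1)
        # InfoNCE-style loss (assumed): matching views of the same graph are positives.
        logits = z1 @ z2.t() / self.tau
        target = torch.arange(z1.shape[0], device=z1.device)
        return F.cross_entropy(logits, target)


contrast = ToyMultiviewContrast(ToyEncoder(8, 16), hidden_dim=16)
batch = torch.randn(4, 30, 8)  # 4 toy "proteins" with 30 nodes each
view1 = add_noise(random_crop(batch, 20))
view2 = add_noise(random_crop(batch, 20))
loss = contrast(view1, view2)
```

The node features are computed by the encoder but never reach the MLP in this sketch, which mirrors point 1.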

Yanara-Tian commented 1 year ago

Thank you very much for your detailed answer, I benefited a lot.

Regarding the third question: you mean that MultiviewContrast is used only during pre-training to obtain a good protein encoder, and that only this encoder is needed to encode proteins for downstream tasks. In this model, the encoder is GearNet.

Oxer11 commented 1 year ago

Yes.
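
The workflow confirmed here could look roughly like the following sketch, again in plain PyTorch with hypothetical names (ToyEncoder, projection_head and task_head are not from the repository, and the in-memory buffer stands in for a checkpoint file). The contrastive projection head is dropped after pre-training, and a new protein is passed directly to the encoder.

```python
import io

import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a pre-trained structure encoder such as GearNet."""

    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, hidden_dim)

    def forward(self, node_input):
        node_feature = torch.relu(self.linear(node_input))
        return {"node_feature": node_feature, "graph_feature": node_feature.mean(dim=1)}


# Pre-training stage: encoder plus a projection head used only for the contrastive loss.
pretrained = ToyEncoder(8, 16)
projection_head = nn.Linear(16, 16)  # discarded after pre-training

# Save only the encoder weights (an in-memory buffer replaces a checkpoint file here).
buffer = io.BytesIO()
torch.save(pretrained.state_dict(), buffer)
buffer.seek(0)

# Downstream stage: restore the encoder and attach a task-specific head.
encoder = ToyEncoder(8, 16)
encoder.load_state_dict(torch.load(buffer))
task_head = nn.Linear(16, 3)  # e.g. a 3-class toy task

new_protein = torch.randn(1, 50, 8)  # one toy protein with 50 nodes
with torch.no_grad():
    # No cropping or noising: the protein goes straight into the encoder.
    graph_feature = encoder(new_protein)["graph_feature"]
    logits = task_head(graph_feature)
```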

Yanara-Tian commented 1 year ago

Thank you very much. Your work is excellent and I have learned a lot.

Oxer11 commented 1 year ago

Feel free to send me an email (zuobai.zhang@mila.quebec) if you have other questions~

Yanara-Tian commented 1 year ago

OK, thank you. Best wishes!