Closed onlyonewater closed 1 year ago
Hi! If you want to extract protein features with GearNet, I suggest reading this tutorial carefully. Basically, you only need to write your own customized dataset and then use GearNet as the encoder.
OK, got it. I will give it a try. Thanks!
Hello, I haven't used pretrained models before, and after reading the documentation, I still feel a bit confused. I want to use my own PDB dataset, but I don't know how to load it and apply the pretrained model to obtain its representation
Hi, to use the pre-trained models, please check the last section of this tutorial for pre-training and fine-tuning. You will find how to load the pre-trained model and fine-tune on your own dataset.
Hello, I'm sorry to bother you again. I want to use GearNet loaded with pre-trained weights to extract protein features, but the program produces no output and never terminates. What's the problem?
# protein
protein = data.Protein.from_pdb(pdb_file, atom_feature="position", bond_feature="length", residue_feature="symbol")
_protein = data.Protein.pack([protein])
protein = graph_construction_model(_protein)
# model
gearnet_edge = models.GearNet(input_dim=21, hidden_dims=[512, 512, 512, 512, 512, 512],
                              num_relation=7, edge_input_dim=59, num_angle_bin=8,
                              batch_norm=True, concat_hidden=True, short_cut=True, readout="sum")
pthfile = 'models/angle_gearnet_edge.pth'
net = torch.load(pthfile)
gearnet_edge.load_state_dict(net)
# output
with torch.no_grad():
    output = gearnet_edge(protein, protein.node_feature.float(), all_loss=None, metric=None)
Hi, the code looks good to me.
Could you provide more information about the bug? What do you mean by no output from the program? It seems that you haven't included a print in your code. If the code runs without terminating, could you please show which part it gets stuck at?
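One way to find out where a script is stuck is Python's standard `faulthandler` module, which can periodically dump the stack of every thread while the program keeps running. The sketch below is a generic illustration, not part of TorchDrug: `run_with_watchdog`, `slow_forward_pass`, and the log file name are hypothetical, and the short sleep only stands in for the slow forward call.

```python
import faulthandler
import os
import tempfile
import time


def run_with_watchdog(fn, log_path, timeout_s=60.0):
    """Run fn(); every timeout_s seconds that it is still running,
    write the stack of all threads to log_path."""
    with open(log_path, "w") as log:
        faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log)
        try:
            return fn()
        finally:
            # Stop the watchdog before the log file is closed.
            faulthandler.cancel_dump_traceback_later()


def slow_forward_pass():
    # Hypothetical stand-in for gearnet_edge(protein, ...).
    time.sleep(0.5)
    return "done"


log_path = os.path.join(tempfile.mkdtemp(), "hang_trace.log")
result = run_with_watchdog(slow_forward_pass, log_path, timeout_s=0.1)

with open(log_path) as f:
    trace = f.read()
print(result)
print(trace)  # shows the file, line, and function that was executing
```

In a real run, the function passed in would wrap the model call, and the dumped stack would show which library frame the program is sitting in.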
The program has been stuck on the last line of code:
output = gearnet_edge(protein, protein.node_feature.float(), all_loss=None, metric=None)
I think it's simply because the model hasn't finished yet, since you're running the code on CPU without putting the model on GPU.
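As a rough illustration of the suggestion above, the usual PyTorch pattern is to pick a device once and place both the model and its inputs on it. This is a minimal sketch in plain PyTorch: the `nn.Linear` layer is a hypothetical stand-in for `gearnet_edge`, and the random tensor stands in for the protein features. In the real script the packed protein graph would also need to be moved to the same device.

```python
import torch
from torch import nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for gearnet_edge; the real model is moved the same way.
encoder = nn.Linear(21, 512).to(device)

# Inputs must live on the same device as the model parameters.
features = torch.randn(57, 21, device=device)

with torch.no_grad():
    output = encoder(features)

print(output.device.type, tuple(output.shape))
```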
I tried replacing 'utils.sparse_coo_tensor' with 'torch.sparse_coo_tensor' in line 802 of "layers/conv.py", and the program was able to continue executing (although an error was reported later).
This may be due to a compilation problem with torch_ext. You can check this issue: https://github.com/DeepGraphLearning/torchdrug/issues/8#issuecomment-916706055
BTW, to generate the embeddings, don't forget to switch the model to .eval() mode by calling gearnet_edge.eval().
Is it written like this?
gearnet_edge.eval()
output = gearnet_edge(graph=protein, input=protein.node_feature.float())
Yes!
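To illustrate why eval mode matters for a model built with batch_norm=True, here is a minimal sketch in plain PyTorch. The small `nn.Sequential` encoder is a hypothetical stand-in for GearNet, and the input is random. In eval mode, batch normalization uses its stored running statistics and does not update them, so repeated calls on the same input give identical embeddings.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for GearNet: a tiny encoder that contains BatchNorm.
encoder = nn.Sequential(nn.Linear(21, 8), nn.BatchNorm1d(8), nn.ReLU())
features = torch.randn(57, 21)  # 57 nodes with 21 input features each

encoder.eval()  # BatchNorm switches to its stored running statistics
stats_before = encoder[1].running_mean.clone()

with torch.no_grad():  # no autograd graph is built during extraction
    first = encoder(features)
    second = encoder(features)

same_output = torch.equal(first, second)
stats_unchanged = torch.equal(stats_before, encoder[1].running_mean)
print(same_output, stats_unchanged, first.requires_grad)
```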
Hello, I haven't used pretrained models before, and after reading the documentation, I still feel a bit confused. I want to use my own PDB dataset, but I don't know how to load it and apply the pretrained model to obtain its representation
Hello, I would like to ask how you built graphs from your own PDB dataset, and whether you managed to get it working. I also want to build protein graphs on my own dataset. Could you share your implementation of this part? Thank you very much.
Is it written like this?
gearnet_edge.eval()
output = gearnet_edge(graph=protein, input=protein.node_feature.float())
Hi pearl-rabbit: I am also trying to figure out how to use GearNet loaded with pre-trained weights to extract protein features. I am writing the same code as you, and below is my code:
import os
import sys
import argparse
import torch
from torchdrug import core
sys.path.append(os.path.dirname(os.path.dirname(__file__)))
from gearnet.model import GearNetIEConv
from torchdrug.data import Protein
from torchdrug.core import Registry as R
from torchdrug import data, utils
from torchdrug import layers
from torchdrug.layers import geometry
from torchdrug import models
pdb_file = utils.download("https://files.rcsb.org/download/2LWZ.pdb", "./")
graph_construction_model = layers.GraphConstruction(node_layers=[geometry.AlphaCarbonNode()],
                                                    edge_layers=[geometry.SpatialEdge(radius=10.0, min_distance=5),
                                                                 geometry.KNNEdge(k=10, min_distance=5),
                                                                 geometry.SequentialEdge(max_distance=2)],
                                                    edge_feature="gearnet")
# protein
protein = Protein.from_pdb(pdb_file, atom_feature="position", bond_feature="length", residue_feature="symbol")
_protein = Protein.pack([protein])
protein = graph_construction_model(_protein)
# model
gearnet_edge = models.GearNet(input_dim=21, hidden_dims=[512, 512, 512, 512, 512, 512],
                              num_relation=7, edge_input_dim=59, num_angle_bin=8,
                              batch_norm=True, concat_hidden=True, short_cut=True, readout="sum")
pthfile = '/content/angle_gearnet_edge.pth'
net = torch.load(pthfile, map_location=torch.device('cpu'))
gearnet_edge.load_state_dict(net)
# output
with torch.no_grad():
    gearnet_edge.eval()
    print(protein)
    output = gearnet_edge(protein, protein.node_feature.float(), all_loss=None, metric=None)
    print(output)
However, I am getting the following error:
Traceback (most recent call last):
  File "/content/GearNet-main/script/test.py", line 40, in <module>
    output = gearnet_edge(protein, protein.node_feature.float(), all_loss=None, metric=None)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torchdrug/models/gearnet.py", line 95, in forward
    hidden = self.layers[i](graph, layer_input)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torchdrug/layers/conv.py", line 91, in forward
    update = self.message_and_aggregate(graph, input)
  File "/usr/local/lib/python3.10/dist-packages/torchdrug/layers/conv.py", line 813, in message_and_aggregate
    return update.view(graph.num_node, self.num_relation * self.input_dim)
RuntimeError: shape '[57, 147]' is invalid for input of size 1197
Can you please take a look at my code? Thank you so much
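For anyone reading the traceback above, the numbers in the error message can be decoded with simple arithmetic. The sketch below uses only values that appear in the traceback and in the model constructor. It shows what the failing `view` call expects versus what it received, and does not by itself identify the cause.

```python
# Values taken from the traceback and the GearNet constructor above.
num_node = 57        # first dimension of the requested shape [57, 147]
num_relation = 7     # GearNet(num_relation=7)
input_dim = 21       # GearNet(input_dim=21)
actual_size = 1197   # "input of size 1197"

# view(graph.num_node, self.num_relation * self.input_dim) needs this many elements:
expected_cols = num_relation * input_dim
expected_size = num_node * expected_cols

# The tensor that actually arrived holds this many values per node:
actual_cols = actual_size // num_node

print(expected_cols, expected_size)  # 147 8379
print(actual_cols)                   # 21
```

The aggregated tensor therefore holds 21 values per node, which equals input_dim, instead of the 147 values (7 relations times 21 features) that the layer expects. This is consistent with the relation dimension having been lost before the `view` call.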
Hi authors, great work! I want to use GearNet as a feature extractor to extract protein features. How can I do that?
Thanks!