awslabs / dgl-lifesci

Python package for graph neural networks in chemistry and biology
Apache License 2.0
714 stars 147 forks

load_pretrained #155

Closed ph-mehdi closed 2 years ago

ph-mehdi commented 2 years ago

I want to take a pretrained model, for example GCN_attentivefp_SIDER, and replace its last layer so that the output size is 100 instead of 27. Also, how can I access the implementations of these models?

mufeili commented 2 years ago

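You can do something like the following:

from dgllife.model import MLPPredictor

# Assume you have already loaded the pre-trained model as `model`.
# gnn_out_feats, predictor_hidden_feats and predictor_dropout are placeholders
# for the values the model was built with.
model.predict = MLPPredictor(2 * gnn_out_feats, predictor_hidden_feats, 100, predictor_dropout)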

See the model definition here.

ph-mehdi commented 2 years ago

Thank you for your answer. For Tox21, I can easily get the output vector without changing the last layer, but when I do the same for SIDER, I get an error.


ValueError Traceback (most recent call last)
in ()
----> 1 label_pred = model_sider(g, feats)
      2 print(smiles)
      3 print(label_+pred[:, mask_s != 0])

9 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in _verify_batch_size(size)
   2245 size_prods *= size[i + 2]
   2246 if size_prods == 1:
-> 2247 raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
   2248
   2249

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])

mufeili commented 2 years ago

What's the shape of feats in this case?

ph-mehdi commented 2 years ago

The shape of feats is torch.Size([24, 74]). The model is "GCN_attentivefp_SIDER". I wanted to change the last layer of this model, which is inside the MLPPredictor (a Sequential container).

GCNPredictor(
  (gnn): GCN(
    (gnn_layers): ModuleList(
      (0): GCNLayer(
        (graph_conv): GraphConv(in=39, out=256, normalization=none, activation=<function relu at 0x7f971e0f5200>)
        (dropout): Dropout(p=0.08333992387843633, inplace=False)
        (bn_layer): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): GCNLayer(
        (graph_conv): GraphConv(in=256, out=256, normalization=none, activation=<function relu at 0x7f971e0f5200>)
        (dropout): Dropout(p=0.08333992387843633, inplace=False)
        (bn_layer): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): GCNLayer(
        (graph_conv): GraphConv(in=256, out=256, normalization=none, activation=<function relu at 0x7f971e0f5200>)
        (dropout): Dropout(p=0.08333992387843633, inplace=False)
        (bn_layer): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): GCNLayer(
        (graph_conv): GraphConv(in=256, out=256, normalization=none, activation=<function relu at 0x7f971e0f5200>)
        (dropout): Dropout(p=0.08333992387843633, inplace=False)
        (bn_layer): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (readout): WeightedSumAndMax(
    (weight_and_sum): WeightAndSum(
      (atom_weighting): Sequential(
        (0): Linear(in_features=256, out_features=1, bias=True)
        (1): Sigmoid()
      )
    )
  )
  (predict): MLPPredictor(
    (predict): Sequential(
      (0): Dropout(p=0.08333992387843633, inplace=False)
      (1): Linear(in_features=512, out_features=1024, bias=True)
      (2): ReLU()
      (3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (4): Linear(in_features=1024, out_features=27, bias=True)
    )
  )
)
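
Going by the printout above, one possible way to change only the final layer (rather than the whole MLPPredictor) is to swap the last Linear inside the Sequential for a new one with 100 outputs. The sketch below is only an illustration: it uses a plain torch.nn.Sequential as a stand-in with the layer sizes copied from the printout, and the helper name replace_output_layer is hypothetical. On the real model the corresponding container would presumably be model.predict.predict. The new layer is randomly initialized, so it would still need training.

```python
import torch
from torch import nn

# Stand-in for the prediction head shown in the printout; this is NOT the
# dgllife class itself, only a Sequential with the same layer sizes.
head = nn.Sequential(
    nn.Dropout(p=0.08333992387843633),
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.BatchNorm1d(1024),
    nn.Linear(1024, 27),
)


def replace_output_layer(seq, n_tasks):
    """Swap the final Linear of a Sequential for one with n_tasks outputs.

    Hypothetical helper: every other layer (and its weights) is left untouched.
    """
    last_idx = len(seq) - 1
    last = seq[last_idx]
    if not isinstance(last, nn.Linear):
        raise TypeError("expected the last layer to be nn.Linear")
    seq[last_idx] = nn.Linear(last.in_features, n_tasks)
    return seq


replace_output_layer(head, 100)

# Eval mode so that BatchNorm1d accepts a batch of size 1.
head.eval()
out = head(torch.zeros(1, 512))
print(out.shape)
```

Compared with rebuilding the whole predictor, this keeps the pretrained weights of the earlier head layers; whether that is preferable depends on how different the new task is.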

mufeili commented 2 years ago

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])

This is probably due to the use of something like batch normalization, which requires a batch size greater than 1 during training. Typically, if you train with a batch size B greater than 1, the last batch can be smaller than B, and in your case it may contain a single sample. You can simply drop the last batch during training.
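
A minimal sketch of what dropping the last batch means, in plain Python. The helper make_batches and the 9-sample dataset are made up for illustration and are not part of DGL-LifeSci.

```python
def make_batches(items, batch_size, drop_last=False):
    """Split items into consecutive batches of at most batch_size elements.

    With drop_last=True, a trailing batch shorter than batch_size is discarded.
    """
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


samples = list(range(9))  # 9 hypothetical training samples

# Without dropping, the final batch holds one sample, which batch norm
# rejects in training mode.
print([len(b) for b in make_batches(samples, 4)])                  # [4, 4, 1]
print([len(b) for b in make_batches(samples, 4, drop_last=True)])  # [4, 4]
```

In PyTorch the same effect is available through the drop_last=True argument of torch.utils.data.DataLoader. For inference on a single graph, calling model.eval() first (as the script later in this thread does) makes batch normalization use its running statistics, so a batch of size 1 is accepted.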

ph-mehdi commented 2 years ago

@mufeili Thank you for your answer. I'm using GCN_attentivefp_SIDER, but when I pass the SIDER dataset to it, I get the following error.

in ()
      2 smiles, g, label, mask = dataset[0]
      3 feats = g.ndata.pop('h')
----> 4 label_pred = model(g, feats)
      5 print(smiles) # CCOc1ccc2nc(S(N)(=O)=O)sc2c1
      6 print(label_pred[:, mask != 0])

7 frames
/usr/local/lib/python3.7/dist-packages/dgl/nn/pytorch/conv/graphconv.py in forward(self, graph, feat, weight, edge_weight)
    424 rst = graph.dstdata['h']
    425 if weight is not None:
--> 426 rst = th.matmul(rst, weight)
    427
    428 if self._norm != 'none':

RuntimeError: mat1 and mat2 shapes cannot be multiplied (13x74 and 39x256)

mufeili commented 2 years ago

Can you provide your DGL version, DGL-LifeSci version and a runnable script to reproduce the issue you encountered?

ph-mehdi commented 2 years ago

DGL version = 0.6.1
DGL-LifeSci version = 0.2.8
Source code:

from dgllife.data import SIDER
from dgllife.model import load_pretrained
from dgllife.utils import smiles_to_bigraph, CanonicalAtomFeaturizer

dataset = SIDER(smiles_to_bigraph, CanonicalAtomFeaturizer())
model = load_pretrained('GCN_attentivefp_SIDER')  # Pretrained model loaded
model.eval()

smiles, g, label, mask = dataset[2]
feats = g.ndata.pop('h')
label_pred = model(g, feats)
print(smiles)
print(label_pred)

I think the problem is with the input to the "GCN_attentivefp_SIDER" model: the shape of the variable feats is torch.Size([23, 74]).

mufeili commented 2 years ago

You need to use a different featurizer for models marked with "attentivefp":

from dgllife.utils import smiles_to_bigraph, AttentiveFPAtomFeaturizer

dataset = SIDER(smiles_to_bigraph, AttentiveFPAtomFeaturizer())
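
The shape error reported earlier follows from the feature sizes. The first GraphConv of the model expects 39 input features per atom (in=39 in the printout), while CanonicalAtomFeaturizer produced 74. A plain-Python sketch of that check is below. The helper matmul_shape is hypothetical, and the value 39 for AttentiveFPAtomFeaturizer is inferred from the printout and from the fact that switching featurizers resolved the error.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise ValueError."""
    (n, k), (k2, m) = a_shape, b_shape
    if k != k2:
        raise ValueError(f"shapes cannot be multiplied ({n}x{k} and {k2}x{m})")
    return (n, m)


# Weight shape of the first GraphConv, taken from the model printout.
WEIGHT_SHAPE = (39, 256)

# Per-atom feature sizes as they appear in this thread (39 is inferred).
FEAT_SIZE = {"CanonicalAtomFeaturizer": 74, "AttentiveFPAtomFeaturizer": 39}

n_atoms = 13  # number of atoms in the molecule from the traceback
for name, size in FEAT_SIZE.items():
    try:
        print(name, "->", matmul_shape((n_atoms, size), WEIGHT_SHAPE))
    except ValueError as err:
        print(name, "->", err)
```

Comparing feats.shape[1] with the input size shown in the model printout before calling the model is a quick way to catch a featurizer mismatch of this kind.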
ph-mehdi commented 2 years ago

Thanks, it worked