Open rnilsenhub opened 1 year ago
For BP ontology prediction, the pre-trained PFresGO model has only one hidden layer ("--num_hidden_layers 1"), while for MF and CC ontology prediction the pre-trained PFresGO model has 2 hidden layers ("--num_hidden_layers 2"). Please check the parameters carefully.
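The note above can be captured as a small lookup, so the flag value is derived from the ontology instead of being typed by hand. This is only a sketch: `HIDDEN_LAYERS_BY_ONTOLOGY` and `hidden_layers_for` are hypothetical names that do not exist in PFresGO, and the values are just the ones stated in the note.

```python
# Hypothetical helper, not part of the PFresGO repository.
# Values come from the note above: BP -> 1 hidden layer, MF and CC -> 2.
HIDDEN_LAYERS_BY_ONTOLOGY = {"bp": 1, "mf": 2, "cc": 2}


def hidden_layers_for(ontology: str) -> int:
    """Return the --num_hidden_layers value matching a pre-trained model."""
    key = ontology.strip().lower()
    if key not in HIDDEN_LAYERS_BY_ONTOLOGY:
        raise ValueError(f"unknown ontology {ontology!r}; expected bp, mf or cc")
    return HIDDEN_LAYERS_BY_ONTOLOGY[key]


if __name__ == "__main__":
    # Print the flag pair to pass to predict.py for each ontology.
    for onto in ("bp", "mf", "cc"):
        print(f"--ontology {onto} --num_hidden_layers {hidden_layers_for(onto)}")
```

A mismatch between this value and the saved weights is the kind of situation in which Keras reports a weight-count error when loading.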
python predict.py --num_hidden_layers 2 --ontology 'mf' --model_name 'MF_PFresGO' --res_embeddings './Datasets/per_residue_embeddings.h5'
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
padding (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
res_embed (InputLayer) [(None, None, 1024)] 0
__________________________________________________________________________________________________
seq (InputLayer) [(None, None, 26)] 0
__________________________________________________________________________________________________
tf.math.equal (TFOpLambda) (None, None) 0 padding[0][0]
__________________________________________________________________________________________________
model (Functional) (None, None, 128) 1344896 res_embed[0][0]
__________________________________________________________________________________________________
AA_embedding (Dense) (None, None, 128) 3328 seq[0][0]
__________________________________________________________________________________________________
tf.cast (TFOpLambda) (None, None) 0 tf.math.equal[0][0]
__________________________________________________________________________________________________
Add_embedding (Add) (None, None, 128) 0 model[0][0]
AA_embedding[0][0]
__________________________________________________________________________________________________
tf.__operators__.getitem (Slici (None, 1, 1, None) 0 tf.cast[0][0]
__________________________________________________________________________________________________
decoder_1 (Decoder) ((1, 489, 128), {'de 396160 Add_embedding[0][0]
tf.__operators__.getitem[0][0]
__________________________________________________________________________________________________
group_wise_linear (GroupWiseLin (1, 489) 302691 decoder_1[0][0]
==================================================================================================
Total params: 2,047,075
Trainable params: 2,047,075
Non-trainable params: 0
__________________________________________________________________________________________________
ValueError: Layer #2 (named "decoder_1" in the current model) was found to correspond to layer decoder_1 in the save file. However the new layer decoder_1 expects 26 weights, but the saved weights have 52 elements.
Why? @BioColLab
Hi, I have updated the pre-trained model. Please go ahead and re-download it, and now it runs smoothly.
Thank you very much, now it runs smoothly. However, I found and fixed one problem while trying it.
Problem
TypeError: 'list' object cannot be interpreted as an integer
Traceback (most recent call last):
File "predict.py", line 60, in <module>
model_name_prefix=args.model_name, label_embedding=go_emb, hidden_size=args.hidden_size, num_heads=args.num_heads, autoencoder_name=args.autoencoder_name)
File "/home/liujianan/projects/PFresGO/pfresgo/PFresGO.py", line 21, in __init__
self.decoder = Decoder(num_hidden_layers, hidden_size, num_heads, dff, output_dim, rate=0.1)
File "/home/liujianan/projects/PFresGO/pfresgo/PFresGO_decoder.py", line 15, in __init__
self.dec_layers = [DecoderLayer(d_model, num_heads, dff, rate) for _ in range(num_layers)]
TypeError: 'list' object cannot be interpreted as an integer
Fix: modify pfresgo/PFresGO_decoder.py
class Decoder(tf.keras.layers.Layer):
    def __init__(self, num_layers, d_model, num_heads, dff, target_goemb_size, rate=0.1, **kwargs):
        super(Decoder, self).__init__()
        self.d_model = d_model
        # self.num_layers = num_layers
        self.num_layers = num_layers[0]
        self.num_heads = num_heads
        self.dff = dff
        self.target_goemb_size = target_goemb_size
        self.rate = rate
        # self.dec_layers = [DecoderLayer(d_model, num_heads, dff, rate) for _ in range(num_layers)]
        self.dec_layers = [DecoderLayer(d_model, num_heads, dff, rate) for _ in range(num_layers[0])]
        self.dropout = tf.keras.layers.Dropout(rate)
        super(Decoder, self).__init__(**kwargs)
Best wishes.
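One caveat with indexing `num_layers[0]`: it assumes the value is always a list. When `--num_hidden_layers` is omitted on the command line, argparse passes the default through unchanged (the plain int `2`), and an int cannot be indexed. A minimal sketch of a normalizer that accepts both forms follows; `as_layer_count` is a hypothetical name, not something in the PFresGO code.

```python
def as_layer_count(num_layers):
    """Return one positive int from an int or a one-element list/tuple.

    Hypothetical helper (not part of PFresGO): it tolerates both the list
    produced by nargs='+' and the bare int used as the argparse default.
    """
    if isinstance(num_layers, (list, tuple)):
        if len(num_layers) != 1:
            raise ValueError(f"expected exactly one layer count, got {num_layers!r}")
        num_layers = num_layers[0]
    count = int(num_layers)
    if count < 1:
        raise ValueError(f"layer count must be at least 1, got {count}")
    return count


if __name__ == "__main__":
    print(as_layer_count([2]))  # list form, as produced by nargs='+'
    print(as_layer_count(2))    # int form, as produced by the default
```

Inside `Decoder.__init__` the result could then be used wherever `num_layers[0]` appears, though fixing the parser itself avoids the need for this.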
I think this comes from the argument parser in the earlier code: with nargs='+', argparse stores the value as a list, but an int is expected here. I think the line https://github.com/BioColLab/PFresGO/blob/43d7abe4752a1cf8afb22cc41b349379e6018284/predict.py#L18C1-L18C42 :
parser.add_argument('-hlayer', '--num_hidden_layers', type=int, default=2, nargs='+', help="Number of hidden layers.")
should be changed to:
parser.add_argument('-hlayer', '--num_hidden_layers', type=int, default=2, nargs='?', help="Number of hidden layers.")
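The difference between the two settings can be checked in isolation with the standard-library `argparse` alone. The sketch below rebuilds only this one argument; the `parse` helper is hypothetical and exists just for the comparison. It also tries a third variant, dropping `nargs` entirely, which was not proposed in this thread and is shown only for comparison.

```python
import argparse


def parse(nargs, argv):
    """Parse argv with --num_hidden_layers declared using the given nargs.

    Hypothetical helper for this comparison; nargs=None means the keyword
    is left out of add_argument altogether.
    """
    parser = argparse.ArgumentParser()
    extra = {} if nargs is None else {"nargs": nargs}
    parser.add_argument('-hlayer', '--num_hidden_layers', type=int, default=2,
                        help="Number of hidden layers.", **extra)
    return parser.parse_args(argv).num_hidden_layers


flag = ['--num_hidden_layers', '2']

print(repr(parse('+', flag)))   # [2]  -> a list, which range() rejects
print(repr(parse('+', [])))     # 2    -> the default is passed through as an int
print(repr(parse('?', flag)))   # 2    -> a plain int
print(repr(parse('?', ['--num_hidden_layers'])))  # None -> flag given without a value
print(repr(parse(None, flag)))  # 2    -> a plain int
```

With `nargs='?'`, passing the flag without a value yields `None` (the default `const`), so code downstream would still need to handle that case; without `nargs`, argparse rejects a bare flag at parse time.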
In XX_PFresGO_best_train_model.h5, /model_weights/model/encoder2/kernel:0 has a shape of (64, 256). Running predict.py with the default arguments results in the error: ValueError: Cannot assign to variable encoder2/kernel:0 due to variable shape (256, 128) and value shape (256, 64) are incompatible. Changing hidden_size to 64 results in a different error: ValueError: Operands could not be broadcast together with shapes (None, 128) (None, 64). Thank you for publishing your research and this code repository.
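When chasing this kind of error it can help to list every weight whose saved shape disagrees with what the model expects, not only the first one Keras reports. The following is a minimal sketch of such a comparison. `shape_mismatches` is a hypothetical helper, and the two dictionaries are filled in by hand from the error message above; in practice they would be collected from the built model and from the .h5 file, which is not shown here.

```python
def shape_mismatches(expected, saved):
    """Return {name: (expected_shape, saved_shape)} for every disagreement.

    Hypothetical helper: `expected` and `saved` map weight names to shapes.
    A weight missing from `saved` is reported with a saved shape of None.
    """
    problems = {}
    for name, exp_shape in expected.items():
        got = saved.get(name)
        got_shape = None if got is None else tuple(got)
        if got_shape != tuple(exp_shape):
            problems[name] = (tuple(exp_shape), got_shape)
    return problems


# Shapes copied by hand from the ValueError quoted above.
expected = {"encoder2/kernel:0": (256, 128)}
saved = {"encoder2/kernel:0": (256, 64)}

for name, (exp_shape, got_shape) in shape_mismatches(expected, saved).items():
    print(f"{name}: model expects {exp_shape}, file has {got_shape}")
```

A full listing would show whether a single hidden_size value can satisfy every layer, or whether different parts of the saved model were built with different sizes.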