Closed alkaou closed 1 month ago
You can follow the instructions in the error message to register the custom objects.
Here are the multiple ways of doing it: https://keras.io/guides/serialization_and_saving/#custom-objects
I've tried everything possible, but I still can't load my saved model. I made a Git repository; please take a look at my code to help me.
The code is here:
https://github.com/alkaou/GenIA_LLM.git
You are writing your own Keras custom layers, but Keras passes its own arguments (such as trainable or name) into its layers, so you should pass **kwargs through all custom layers and models, because we don't know which arguments Keras passes during training.
Either you give the arguments used by the custom classes from outside create_model(), like these:

vocab_size = 20000  # Only consider the top 20k words
maxlen = 80  # Max sequence size
embed_dim = 256  # Embedding size for each token
num_heads = 4  # Number of attention heads
feed_forward_dim = 256  # Hidden layer size in feed forward network inside transformer

to the custom_objects argument of load_model, or you provide them in the layer's config.
Method 1:
import keras
from keras import layers, ops

class TransformerBlock(layers.Layer):
    # Accept **kwargs so the layer also takes the arguments Keras itself
    # passes when (re)building it, e.g. name, trainable, dtype.
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1, **kwargs):
        super().__init__(**kwargs)  # forward them so they properly flow into the original keras.layers.Layer
        self.att = layers.MultiHeadAttention(num_heads, embed_dim)
        self.ffn = keras.Sequential(
            [
                layers.Dense(ff_dim, activation="relu"),
                layers.Dense(embed_dim),
            ]
        )
        self.layernorm1 = layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = layers.Dropout(rate)
        self.dropout2 = layers.Dropout(rate)

    def call(self, inputs):
        input_shape = ops.shape(inputs)
        batch_size = input_shape[0]
        seq_len = input_shape[1]
        # causal_attention_mask is the helper from the miniGPT example,
        # assumed to be defined elsewhere in your code.
        causal_mask = causal_attention_mask(batch_size, seq_len, seq_len, "bool")
        attention_output = self.att(inputs, inputs, attention_mask=causal_mask)
        attention_output = self.dropout1(attention_output)
        out1 = self.layernorm1(inputs + attention_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output)
        return self.layernorm2(out1 + ffn_output)

# Same for TokenAndPositionEmbedding:
class TokenAndPositionEmbedding(layers.Layer):
    def __init__(self, maxlen, vocab_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_emb = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.pos_emb = layers.Embedding(input_dim=maxlen, output_dim=embed_dim)

    def call(self, x):
        maxlen = ops.shape(x)[-1]
        positions = ops.arange(0, maxlen, 1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions
Then, while loading, you have to pass all of those arguments into custom_objects in load_model:
keras.models.load_model(
    "br_model.keras",
    custom_objects={
        "vocab_size": 20000,
        "maxlen": 80,
        "embed_dim": 256,
        "num_heads": 4,
        "feed_forward_dim": 256,
        "TokenAndPositionEmbedding": TokenAndPositionEmbedding,
        "TransformerBlock": TransformerBlock,
    },
)
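As an aside: Keras also offers keras.utils.custom_object_scope as an alternative to the custom_objects argument. A minimal sketch, assuming the same classes and values as above:

with keras.utils.custom_object_scope(
    {
        "vocab_size": 20000,
        "maxlen": 80,
        "embed_dim": 256,
        "num_heads": 4,
        "feed_forward_dim": 256,
        "TokenAndPositionEmbedding": TokenAndPositionEmbedding,
        "TransformerBlock": TransformerBlock,
    }
):
    model = keras.models.load_model("br_model.keras")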
Method 2: or you can implement get_config() and from_config() and avoid passing the constructor arguments in custom_objects:
class TransformerBlock(layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1, **kwargs):
        super().__init__(**kwargs)
        self.att = layers.MultiHeadAttention(num_heads, embed_dim)
        self.ffn = keras.Sequential(
            [
                layers.Dense(ff_dim, activation="relu"),
                layers.Dense(embed_dim),
            ]
        )
        self.layernorm1 = layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = layers.Dropout(rate)
        self.dropout2 = layers.Dropout(rate)

    def call(self, inputs):
        input_shape = ops.shape(inputs)
        batch_size = input_shape[0]
        seq_len = input_shape[1]
        causal_mask = causal_attention_mask(batch_size, seq_len, seq_len, "bool")
        attention_output = self.att(inputs, inputs, attention_mask=causal_mask)
        attention_output = self.dropout1(attention_output)
        out1 = self.layernorm1(inputs + attention_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output)
        return self.layernorm2(out1 + ffn_output)

    def get_config(self):
        # Record the constructor arguments so load_model can rebuild the layer.
        config = super().get_config().copy()
        config.update(
            {
                "embed_dim": self.att.key_dim,
                "num_heads": self.att.num_heads,
                "ff_dim": self.ffn.layers[0].units,
                "rate": self.dropout1.rate,
            }
        )
        return config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

class TokenAndPositionEmbedding(layers.Layer):
    def __init__(self, maxlen, vocab_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_emb = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.pos_emb = layers.Embedding(input_dim=maxlen, output_dim=embed_dim)

    def call(self, x):
        maxlen = ops.shape(x)[-1]
        positions = ops.arange(0, maxlen, 1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions

    def get_config(self):
        config = super().get_config().copy()
        config.update(
            {
                "maxlen": self.pos_emb.input_dim,
                "vocab_size": self.token_emb.input_dim,
                "embed_dim": self.token_emb.output_dim,
            }
        )
        return config

    @classmethod
    def from_config(cls, config):
        return cls(**config)
And when loading the model:
keras.models.load_model(
    "br_model.keras",
    custom_objects={
        "TokenAndPositionEmbedding": TokenAndPositionEmbedding,
        "TransformerBlock": TransformerBlock,
    },
)
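Before saving a full model, one quick way to sanity-check the get_config()/from_config() plumbing is to round-trip a single layer. A minimal sketch, assuming the Method 2 classes above are in scope:

# Build a block, serialize its config, and rebuild it from that config.
block = TransformerBlock(embed_dim=256, num_heads=4, ff_dim=256)
config = block.get_config()
rebuilt = TransformerBlock.from_config(config)
print(config["embed_dim"], config["num_heads"], config["ff_dim"])  # 256 4 256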
See whether this solves it
Thank you very much. It's working.
Brother, close this as completed. Thank you.
Is it possible to use the Keras register_serializable decorator here and avoid using the custom scope?
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
Just another user here, @emi-dm. I think the answer is yes (in fact, it is recommended). I'd assume that adding the decorator to Method 2 should work without declaring custom objects.
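A minimal sketch of that, assuming the Method 2 classes above (the package name "genia" is arbitrary, and the decorator must already be applied when the model is saved, so the registered name is recorded in the .keras file):

import keras
from keras import layers

@keras.saving.register_keras_serializable(package="genia")
class TransformerBlock(layers.Layer):
    ...  # __init__, call, get_config, from_config as in Method 2 above

@keras.saving.register_keras_serializable(package="genia")
class TokenAndPositionEmbedding(layers.Layer):
    ...  # __init__, call, get_config, from_config as in Method 2 above

# With the classes registered before both saving and loading,
# load_model needs no custom_objects:
model = keras.models.load_model("br_model.keras")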
This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
After training I saved my model, and I can't load it. I tried everything, but it always gives me a custom_objects error.
I based my code on the miniature GPT example in the Keras docs.
code:
### Now, when I try to load my model:
I'm getting these errors: