keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0
61.89k stars 19.45k forks

load_model - the two structures don't have the same nested structure #14345

Closed eyalshafran closed 2 years ago

eyalshafran commented 3 years ago

I trained a model and saved it using model.save("model_name"). I then call model = tf.keras.models.load_model("model_name") and get a "The two structures don't have the same nested structure." error. The structure that the model expects upon loading is not what I defined in the model.

My model:

import tensorflow as tf
from transformers import TFRobertaModel

roberta_layer = TFRobertaModel.from_pretrained('roberta-base')

ids = tf.keras.layers.Input((64,), dtype=tf.int32)
att = tf.keras.layers.Input((64,), dtype=tf.int32)

roberta_inputs = [ids, att]

sequence_output,pooled_output = roberta_layer(roberta_inputs)

# unigram
x1 = tf.keras.layers.Conv1D(32,1,activation='relu')(sequence_output)
x1 = tf.keras.layers.GlobalMaxPool1D()(x1)

# bigram
x2 = tf.keras.layers.Conv1D(32,2,activation='relu')(sequence_output)
x2 = tf.keras.layers.GlobalMaxPool1D()(x2)

# trigram
x3 = tf.keras.layers.Conv1D(32,3,activation='relu')(sequence_output)
x3 = tf.keras.layers.GlobalMaxPool1D()(x3)

concat = tf.keras.layers.Concatenate()([x1,x2,x3])
concat = tf.keras.layers.Dropout(0.5)(concat)

outputs = tf.keras.layers.Dense(11, activation='sigmoid')(concat)

model = tf.keras.Model(inputs=roberta_inputs, outputs=outputs)

I'm using tf 2.4.0 and transformers 3.5.1

The full error is: The two structures don't have the same nested structure.

First structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}

Second structure: type=list str=[TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/1')]

More specifically: The two namedtuples don't have the same sequence type. First structure type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')} has type dict, while second structure type=list str=[TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/1')] has type list

During handling of the above exception, another exception occurred:

wildangunawan commented 3 years ago

I'm having the same issue with TF 2.4.1 and Transformers 4.2.2.

The first structure also has shape=(None, 5), while the second structure has shape=(None, maximum length of a sequence).

Edit:

Workaround for BERT: Save your model as .h5 instead of .tf. I didn't test this for other models, but it might be worth a try.

tarrade commented 3 years ago

Same issue with TF 2.5.0-rc3 and Transformers 4.5.1.

Using the out-of-the-box BERT classifier: model = TFBertForSequenceClassification.from_pretrained('bert-base-multilingual-uncased', num_labels=2), then training the model and trying to save it gives me the same error:

TypeError: The two structures don't have the same nested structure.

First structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}

Second structure: type=list str=[TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids/0')]

Saving the model as .hd5 results in the same error message when reloading the model: model.save('bert_input.hd5')

tarrade commented 3 years ago

Same issue with tensorflow 2.5.0-rc3. Using Huggingface save/load works fine:

model.save_pretrained('bert_input')
reconstructed_model = TFBertForSequenceClassification.from_pretrained('bert_input')

But this is not what we need for tf.serving.

It seems some class/function definitions are not saved with the model, so reloading the model will not work. Maybe there are some parameters to set to make this work? (Stackoverflow)

chenlongzhen commented 3 years ago

Same issue

samsatp commented 3 years ago

Same issue with transformers 4.6.1 and tensorflow 2.4.1

I built the model as follows.

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_4 (InputLayer)            [(None, 512)]        0                                            
__________________________________________________________________________________________________
input_3 (InputLayer)            [(None, 512)]        0                                            
__________________________________________________________________________________________________
tf_distil_bert_model_1 (TFDisti TFBaseModelOutput(la 66362880    input_4[0][0]                    
                                                                 input_3[0][0]                    
__________________________________________________________________________________________________
bidirectional_1 (Bidirectional) (None, 128)          320256      tf_distil_bert_model_1[1][7]     
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 16)           2064        bidirectional_1[0][0]            
__________________________________________________________________________________________________
dropout_58 (Dropout)            (None, 16)           0           dense_2[0][0]                    
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 10)           170         dropout_58[0][0]                 
==================================================================================================
Total params: 66,685,370
Trainable params: 322,490
Non-trainable params: 66,362,880
__________________________________________________________________________________________________

Then it is trained and saved as a SavedModel. But loading it back raises an error:

The two structures don't have the same nested structure.

First structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 512), dtype=tf.int32, name='input_ids/input_ids'), 'attention_mask': TensorSpec(shape=(None, 512), dtype=tf.int32, name='input_ids/attention_mask')}

More specifically: The two dictionaries don't have the same set of keys. First structure has keys type=list str=['input_ids'], while second structure has keys type=list str=['input_ids', 'attention_mask']
Entire first structure:
{'input_ids': .}
Entire second structure:
{'input_ids': ., 'attention_mask': .}

samsatp commented 3 years ago

I finally solved this issue by using tf.saved_model.save(model, path_to_dir) instead of model.save(path_to_dir) or tf.keras.models.save_model(model, path_to_dir).

I followed this guide: https://www.tensorflow.org/guide/saved_model

jvishnuvardhan commented 3 years ago

@eyalshafran Can you share simple standalone code to reproduce the issue? Did you try recent TF/Keras versions? Thanks!

google-ml-butler[bot] commented 3 years ago

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

GiovanniGedosal commented 3 years ago

Same issue

Kdev369 commented 3 years ago

I am also facing the same issue. [image]

After using tf.saved_model.save instead of model.save, this error appeared: tf.saved_model.save(intent_model,'/content/model')

[image]

Please let me know if anyone has solved this issue.

jvishnuvardhan commented 3 years ago

Could any of you share simple standalone code to reproduce the issue? Thanks!

Nash2325138 commented 3 years ago

Using the code from the original post, I can reproduce it (tf.__version__: 2.2.0):

import tensorflow as tf
from transformers import TFRobertaModel

roberta_layer = TFRobertaModel.from_pretrained('roberta-base')

ids = tf.keras.layers.Input((64,), dtype=tf.int32)
att = tf.keras.layers.Input((64,), dtype=tf.int32)

roberta_inputs = [ids, att]

sequence_output,pooled_output = roberta_layer(roberta_inputs)

# unigram
x1 = tf.keras.layers.Conv1D(32,1,activation='relu')(sequence_output)
x1 = tf.keras.layers.GlobalMaxPool1D()(x1)

# bigram
x2 = tf.keras.layers.Conv1D(32,2,activation='relu')(sequence_output)
x2 = tf.keras.layers.GlobalMaxPool1D()(x2)

# trigram
x3 = tf.keras.layers.Conv1D(32,3,activation='relu')(sequence_output)
x3 = tf.keras.layers.GlobalMaxPool1D()(x3)

concat = tf.keras.layers.Concatenate()([x1,x2,x3])
concat = tf.keras.layers.Dropout(0.5)(concat)

outputs = tf.keras.layers.Dense(11, activation='sigmoid')(concat)

model = tf.keras.Model(inputs=roberta_inputs, outputs=outputs)

Then save the model

model.save('_tmp_model')

Then the following load will produce the error

restored_model = tf.keras.models.load_model("_tmp_model")

But if I load by tf.saved_model.load instead of tf.keras.models.load_model, it will succeed

restored_model = tf.saved_model.load("_tmp_model")

I also tried saving with tf.keras.models.save_model instead of model.save; the same error still occurs:

tf.keras.models.save_model(model, '_tmp_model_keras')
restored_model = tf.keras.models.load_model('_tmp_model_keras')

I also tried tf 2.6.0, still the same. Here's the error it will produce:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/anaconda/lib/python3.7/site-packages/tensorflow/python/util/nest.py in assert_same_structure(nest1, nest2, check_types, expand_composites)
    527     _pywrap_utils.AssertSameStructure(nest1, nest2, check_types,
--> 528                                       expand_composites)
    529   except (ValueError, TypeError) as e:

TypeError: The two structures don't have the same nested structure.

First structure: type=tuple str=(({'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')},), {'training': False})

Second structure: type=tuple str=(([TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/1')],), {'training': False})

More specifically: The two namedtuples don't have the same sequence type. First structure type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')} has type dict, while second structure type=list str=[TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/1')] has type list

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_25463/2006687310.py in <module>
----> 1 restored_model = tf.keras.models.load_model("_tmp_model")

/anaconda/lib/python3.7/site-packages/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
    203         filepath = path_to_string(filepath)
    204         if isinstance(filepath, str):
--> 205           return saved_model_load.load(filepath, compile, options)
    206 
    207   raise IOError(

/anaconda/lib/python3.7/site-packages/keras/saving/saved_model/load.py in load(path, compile, options)
    141 
    142   # Finalize the loaded layers and remove the extra tracked dependencies.
--> 143   keras_loader.finalize_objects()
    144   keras_loader.del_tracking()
    145 

/anaconda/lib/python3.7/site-packages/keras/saving/saved_model/load.py in finalize_objects(self)
    638         layers_revived_from_config.append(node)
    639 
--> 640     _finalize_saved_model_layers(layers_revived_from_saved_model)
    641     _finalize_config_layers(layers_revived_from_config)
    642 

/anaconda/lib/python3.7/site-packages/keras/saving/saved_model/load.py in _finalize_saved_model_layers(layers)
    835           continue
    836         if call_fn.input_signature is None:
--> 837           args, kwargs = infer_inputs_from_restored_call_function(call_fn)
    838           args = list(args)
    839           inputs = args.pop(0)

/anaconda/lib/python3.7/site-packages/keras/saving/saved_model/load.py in infer_inputs_from_restored_call_function(fn)
   1172   for concrete in fn.concrete_functions[1:]:
   1173     spec2 = concrete.structured_input_signature
-> 1174     spec = tf.nest.map_structure(common_spec, spec, spec2)
   1175   return spec
   1176 

/anaconda/lib/python3.7/site-packages/tensorflow/python/util/nest.py in map_structure(func, *structure, **kwargs)
    861   for other in structure[1:]:
    862     assert_same_structure(structure[0], other, check_types=check_types,
--> 863                           expand_composites=expand_composites)
    864 
    865   flat_structure = (flatten(s, expand_composites) for s in structure)

/anaconda/lib/python3.7/site-packages/tensorflow/python/util/nest.py in assert_same_structure(nest1, nest2, check_types, expand_composites)
    533                   "Entire first structure:\n%s\n"
    534                   "Entire second structure:\n%s"
--> 535                   % (str(e), str1, str2))
    536 
    537 

TypeError: The two structures don't have the same nested structure.

First structure: type=tuple str=(({'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')},), {'training': False})

Second structure: type=tuple str=(([TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/1')],), {'training': False})

More specifically: The two namedtuples don't have the same sequence type. First structure type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')} has type dict, while second structure type=list str=[TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, 64), dtype=tf.int32, name='inputs/1')] has type list
Entire first structure:
(({'input_ids': .},), {'training': .})
Entire second structure:
(([., .],), {'training': .})
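
To make the mismatch in the traceback concrete, here is a minimal pure-Python sketch of a structure check. It is a simplified stand-in written for illustration, not TensorFlow's tf.nest.assert_same_structure; the function name describe_mismatch and the string placeholders standing in for TensorSpec objects are assumptions.

```python
def describe_mismatch(a, b, path="root"):
    """Return a message for the first structural difference, or None if none is found.

    Simplified illustration only: containers must agree in type, length and keys;
    leaf values are never compared.
    """
    containers = (dict, list, tuple)
    if not isinstance(a, containers) and not isinstance(b, containers):
        return None  # two leaves always match
    if type(a) is not type(b):
        return f"{path}: {type(a).__name__} vs {type(b).__name__}"
    if isinstance(a, dict):
        if sorted(a) != sorted(b):
            return f"{path}: keys {sorted(a)} vs {sorted(b)}"
        pairs = [(f"{path}[{k!r}]", a[k], b[k]) for k in a]
    else:
        if len(a) != len(b):
            return f"{path}: length {len(a)} vs {len(b)}"
        pairs = [(f"{path}[{i}]", x, y) for i, (x, y) in enumerate(zip(a, b))]
    for sub_path, x, y in pairs:
        message = describe_mismatch(x, y, sub_path)
        if message is not None:
            return message
    return None


# Placeholders mirroring the two signatures in the traceback above.
first = (({"input_ids": "int32[None, 5]"},), {"training": False})
second = ((["int32[None, 64]", "int32[None, 64]"],), {"training": False})

print(describe_mismatch(first, second))   # reports a dict vs list mismatch
print(describe_mismatch(second, second))  # None
```

The dict-versus-list difference it reports corresponds to the "has type dict ... has type list" line in the error.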

jvishnuvardhan commented 3 years ago

@Nash2325138 I am facing a different error. Can you please check this gist and let us know what version you are using? Thanks!

eyalshafran commented 3 years ago

A solution is to call the roberta model slightly differently: sequence_output,pooled_output = roberta_layer.roberta(roberta_inputs)

This fixes the problem

jvishnuvardhan commented 3 years ago

@Nash2325138 I am not familiar with this roberta_layer. As you suggested, I updated the above line. Now it is throwing another error. Here is a gist for reference. Thanks!

Nash2325138 commented 3 years ago

Sorry, I forgot to give my transformers version. It's 2.8.0, which is a bit old and might be the issue. The full screenshots: [image] [image] [image]

The input code is:

import tensorflow as tf
import transformers
from transformers import TFRobertaModel

print(tf.__version__)
print(transformers.__version__)

roberta_layer = TFRobertaModel.from_pretrained('roberta-base')

ids = tf.keras.layers.Input((64,), dtype=tf.int32)
att = tf.keras.layers.Input((64,), dtype=tf.int32)

roberta_inputs = [ids, att]

sequence_output,pooled_output = roberta_layer(roberta_inputs)

# unigram
x1 = tf.keras.layers.Conv1D(32,1,activation='relu')(sequence_output)
x1 = tf.keras.layers.GlobalMaxPool1D()(x1)

# bigram
x2 = tf.keras.layers.Conv1D(32,2,activation='relu')(sequence_output)
x2 = tf.keras.layers.GlobalMaxPool1D()(x2)

# trigram
x3 = tf.keras.layers.Conv1D(32,3,activation='relu')(sequence_output)
x3 = tf.keras.layers.GlobalMaxPool1D()(x3)

concat = tf.keras.layers.Concatenate()([x1,x2,x3])
concat = tf.keras.layers.Dropout(0.5)(concat)

outputs = tf.keras.layers.Dense(11, activation='sigmoid')(concat)

model = tf.keras.Model(inputs=roberta_inputs, outputs=outputs)

model.save('_tmp_model')
restored_model = tf.keras.models.load_model("_tmp_model")

Nash2325138 commented 3 years ago

Update: if I change

sequence_output,pooled_output = roberta_layer(roberta_inputs)

to

sequence_output,pooled_output = roberta_layer.roberta(roberta_inputs)

as @eyalshafran suggested, the problem is magically solved. I wonder what the cause is.

Nash2325138 commented 2 years ago

Hmm... Any follow-up on this issue? Although @eyalshafran's suggestion solved this case, I still don't understand why the original code fails.

shreyas-jk commented 2 years ago

I am facing the same issue too.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-3a1d333d606b> in <module>()
----> 1 ner_model_2 = pickle.load(open('ner_model_2.pkl', 'rb'))

2 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/nest.py in assert_same_structure(nest1, nest2, check_types, expand_composites)
    533                   "Entire first structure:\n%s\n"
    534                   "Entire second structure:\n%s"
--> 535                   % (str(e), str1, str2))
    536 
    537 

ValueError: The two structures don't have the same nested structure.

First structure: type=tuple str=(({'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}, None, None, None, None, None, None, None, None, None, None, None, None, False), {})

Second structure: type=tuple str=((TensorSpec(shape=(None, 384), dtype=tf.int32, name='input_ids'), TensorSpec(shape=(None, 384), dtype=tf.int32, name='attention_mask'), TensorSpec(shape=(None, 384), dtype=tf.int32, name='token_type_ids'), None, None, None, None, None, None, None, None, None, None, False), {})

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 384), dtype=tf.int32, name='input_ids')" is not
Entire first structure:
(({'input_ids': .}, ., ., ., ., ., ., ., ., ., ., ., ., .), {})
Entire second structure:
((., ., ., ., ., ., ., ., ., ., ., ., ., .), {})

I have figured out a way to overcome this problem. Instead of saving the model using model.save or tf.keras.models.save_model(), try saving the model weights using model.save_weights('model_weight.h5') during the training process.

At the prediction or testing phase, you need to create the model (architecture) manually, the same way you did in the training phase, and then load the weights to it.

Allen-Qiu commented 2 years ago

Yes, as introduced by shreyas-jk, model.save_weights('model_weight.h5') can deal with this problem. My model contains a huggingface bert. It cannot be saved. However, I can save the weights. Subsequently, I build the model again in other files and load the weights.

dschwalm commented 2 years ago

Same issue. tf 2.7.0 transformers 4.14.1 pretrained model: bert-base-cased

yoonnoon commented 2 years ago

tensorflow 2.7.0 keras 2.7.0 transformers 4.15.0 pretrained model: "bert-base-multilingual-cased"

I also had the same issue when using model.save() and keras.models.load_model(). After changing my Python code to use model.save_weights() and model.load_weights(), the issue was resolved. Thanks to @Allen-Qiu and @shreyas-jk.

christophmeyer commented 2 years ago

In case this is still relevant for people, here is what I think is going on:

Keras saves the input specs on the first call of the model here. When loading a pretrained model with transformers using the from_pretrained classmethod of TFPreTrainedModel, the network is first fed dummy inputs here. So the saved models expect their input tensors to have sequence length 5, because that is the length of the dummy inputs.

In order to change that behaviour, you can reset the input specs before saving to a saved model like this:

# This is just a toy example of an input with sequence length 8 (as opposed to 5), just use a dictionary of your actual features instead. 
features = {"input_ids": tf.constant([[1,2,3,4,5,6,7,8]]), "attention_mask": tf.constant([[1,1,1,1,1,1,1,1]]), "token_type_ids": tf.constant([[0,0,0,0,0,0,0,0]])}

model._saved_model_inputs_spec = None
model._set_save_spec(features)

tf.saved_model.save(model, "./out_dir")
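
The mechanism described above can be mimicked with a toy class. This is a pure-Python sketch, not Keras code: ToyModel, save_spec and set_save_spec are hypothetical names that only loosely mirror _saved_model_inputs_spec and _set_save_spec, and the assumption that only the first call is recorded is taken from the explanation above.

```python
class ToyModel:
    """Toy stand-in for a model that remembers the spec of its first input."""

    def __init__(self):
        # Hypothetical attribute, loosely analogous to _saved_model_inputs_spec.
        self.save_spec = None

    def set_save_spec(self, inputs):
        # Assumption: once a spec is recorded, later inputs do not overwrite it.
        if self.save_spec is None:
            self.save_spec = {name: len(batch[0]) for name, batch in inputs.items()}

    def __call__(self, inputs):
        self.set_save_spec(inputs)
        return inputs


model = ToyModel()

# Stand-in for the length-5 dummy batch fed while loading pretrained weights.
dummy = {"input_ids": [[7, 6, 0, 0, 1]]}
model(dummy)

# The real features arrive later, with a different length and an extra key.
real = {"input_ids": [[1] * 64], "attention_mask": [[1] * 64]}
model(real)
spec_after_real_call = dict(model.save_spec)  # still describes the dummy batch

# Resetting the recorded spec and setting it again picks up the real features.
model.save_spec = None
model.set_save_spec(real)
spec_after_reset = dict(model.save_spec)

print(spec_after_real_call)
print(spec_after_reset)
```

In this toy, the recorded spec keeps describing the length-5 dummy batch until it is explicitly cleared, which is the same shape of problem as the "First structure" with shape=(None, 5) in the error messages.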

gowthamkpr commented 2 years ago

@eyalshafran @yoonnoon @dschwalm @Allen-Qiu @shreyas-jk Can you please try the above proposed solution and let me know if it works. Thanks!

YanlongLai commented 2 years ago

It is not working for me. The first structure is still fixed at 5:


First structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

Second structure: type=dict str={'attention_mask': TensorSpec(shape=(None, 32), dtype=tf.int32, name='inputs/attention_mask'), 'input_ids': TensorSpec(shape=(None, 32), dtype=tf.int32, name='inputs/input_ids')}

google-ml-butler[bot] commented 2 years ago

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

Mohamed-Aziz-Ben-Nessir commented 2 years ago

tf.saved_model.load() instead of tf.keras.models.load_model() on Kaggle TPUs did the job for me.

gowthamkpr commented 2 years ago

@eyalshafran I tried to replicate this issue with transformers 4.11.3 and tensorflow 2.8.2 and ran into a different error. Here's the gist. Also, as mentioned above, can you please use tf.saved_model.load() instead of tf.keras.models.load_model()?

If you think this is still a bug, please create a new issue with a working example and we will look into it. Thanks!

google-ml-butler[bot] commented 2 years ago

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 2 years ago

Closing as stale. Please reopen if you'd like to work on this further.

MrKZZ commented 2 years ago

I tried the suggestion above (save_weights, re-create the model, then load_weights), but got this error:

ValueError: Unable to load weights saved in HDF5 format into a subclassed Model which has not created its variables yet. Call the Model first, then load the weights.

Similarly, I build my model using transformers. In the __init__ function, I build an ALBERT model and load its weights with from_pretrained().
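
The error message says the variables must exist before weights can be loaded into them. A toy pure-Python sketch of that ordering constraint follows; LazyDense is a hypothetical class, not a Keras API, and it only imitates the idea that a subclassed model creates its variables on its first call.

```python
class LazyDense:
    """Toy layer whose weights exist only after the first call."""

    def __init__(self, units):
        self.units = units
        self.weights = None  # created lazily, because the input size is unknown here

    def __call__(self, row):
        if self.weights is None:
            # First call: allocate one weight row per input feature.
            self.weights = [[0.0] * self.units for _ in row]
        return [
            sum(x * w[j] for x, w in zip(row, self.weights))
            for j in range(self.units)
        ]

    def load_weights(self, saved):
        if self.weights is None:
            raise ValueError("Call the model first, then load the weights.")
        self.weights = [list(r) for r in saved]


# Pretend checkpoint: 3 input features, 2 units.
saved = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

model = LazyDense(units=2)
try:
    model.load_weights(saved)  # fails: no variables have been created yet
    error_before_call = None
except ValueError as exc:
    error_before_call = str(exc)

model([0.0, 0.0, 0.0])     # a dummy forward pass creates the variables
model.load_weights(saved)  # now succeeds
output = model([1.0, 1.0, 1.0])

print(error_before_call)
print(output)
```

In the real setting this would presumably correspond to running the rebuilt model once on a dummy batch before calling load_weights, as the error message suggests.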