Closed JKP0 closed 4 years ago
Hello JKP0,
What do you need the .ckpt files for?
@Poaz, we are working on NLG models for coreference resolution. We started our project with BERT, so our implementation depends on the pre-trained BERT model available from the Google API. Now we want to run the same study with DistilBERT. Our implementation is based on TensorFlow 1.14.0.
Our requirement is something like the following:
assignment_map, initialized_variable_names = modeling.get_assignment_map_from_checkpoint(tvars, config['tf_checkpoint']) # essential, unresolved
init_from_checkpoint = tf.train.init_from_checkpoint if config['init_checkpoint'].endswith('ckpt') else load_from_pytorch_checkpoint # essential, unresolved
model.get_all_encoder_layers() # this is our essential, right now completely unresolved for us
model.get_sequence_output() # this is our essential, right now completely unresolved for us
but I do not know of any method (e.g. get_all_encoder_layers(), get_sequence_output(), get_assignment_map_from_checkpoint(), ...) implemented in the DistilBertModel class that provides this. I have checked a lot. In our earlier implementation we defined this method ourselves, using tf.train.list_variables(init_checkpoint) and other TF 1 APIs, for which .ckpt files are essential.
Most of the TF 1 API works with a checkpoint configuration (or serialized object), but we are unable to do the same with the non-sequential .h5 model file produced by TFDistilBertModel. So we need the same kind of files for DistilBERT that are provided here for BERT.
If you or anyone else can suggest a way around this, or a convenient way to get .ckpt files for DistilBERT, many thanks in advance. Thanks!
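For readers unfamiliar with the helper mentioned above, the following is a minimal sketch of the name matching that a function like get_assignment_map_from_checkpoint performs, written from the description in this comment. It works on plain Python lists instead of TensorFlow objects, and the function name build_assignment_map and all variable names in the example are hypothetical.

```python
import collections
import re


def build_assignment_map(trainable_names, checkpoint_vars):
    """Match graph variable names against (name, shape) pairs from a checkpoint.

    trainable_names: graph variable names such as "bert/embeddings/word_embeddings:0".
    checkpoint_vars: (name, shape) pairs, the kind of list that
        tf.train.list_variables(init_checkpoint) returns.
    """
    # Strip the ":0" suffix so graph names are comparable to checkpoint names.
    graph_names = set()
    for name in trainable_names:
        match = re.match(r"^(.*):\d+$", name)
        graph_names.add(match.group(1) if match else name)

    assignment_map = collections.OrderedDict()
    initialized_variable_names = {}
    for name, _shape in checkpoint_vars:
        if name not in graph_names:
            continue  # checkpoint entry with no counterpart in the graph
        assignment_map[name] = name
        initialized_variable_names[name] = 1
        initialized_variable_names[name + ":0"] = 1
    return assignment_map, initialized_variable_names


# Hypothetical names, for illustration only.
tvars = ["bert/embeddings/word_embeddings:0", "coref/output_weights:0"]
ckpt = [("bert/embeddings/word_embeddings", [30522, 768]),
        ("bert/pooler/dense/kernel", [768, 768])]
assignment_map, initialized = build_assignment_map(tvars, ckpt)
print(list(assignment_map))
```

The resulting map is what would be handed to tf.train.init_from_checkpoint, which is why a name-based .ckpt file, and not a weights-only .h5 file, is needed on this code path.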
Okay, thanks for the context. If you are in any way able to use PyTorch for your implementation, you can get the outputs from all layers using the following code:
from transformers import DistilBertTokenizer, DistilBertModel, DistilBertConfig
import torch
config = DistilBertConfig.from_pretrained('distilbert-base-uncased', output_hidden_states=True)
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = DistilBertModel.from_pretrained('distilbert-base-uncased', config=config)
model.eval()
input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(0)
outputs = model(input_ids)
The output will then be outputs[0] with shape (batch_size, seq_length, hidden_size) for the final layer, and outputs[1], a tuple holding one tensor of that shape for the embedding output plus one per layer, with index 0 being the embedding output and the last index being the final layer.
If that is not an option, it is possible to convert the .h5 file to .ckpt using Keras and TensorFlow.
For TF 1.x:
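import tensorflow as tf
from tensorflow import keras
model = keras.models.load_model("model.h5")  # load first so the graph contains variables
saver = tf.train.Saver()  # created after loading; otherwise it raises "No variables to save"
sess = keras.backend.get_session()
save_path = saver.save(sess, "model.ckpt")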
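For TF 2.x:
import tensorflow as tf
from tensorflow import keras
model = keras.models.load_model('model.hdf5', compile=False)
checkpoint = tf.train.Checkpoint(model=model)  # track the model so its variables are saved
save_path = checkpoint.save('model.ckpt')  # runs eagerly; no session is needed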
Hope it helps!
@Poaz your first idea is good, but it would cost us other changes.
The second one gives an error; we have tried a lot. The DistilBERT model saved by model.save_pretrained('dir') is not a sequential or serialized object, and keras.models.load_model("model.h5") only loads sequential and serialized .h5 models.
To save the model:
import tensorflow as tf
from transformers import DistilBertTokenizer, TFDistilBertModel
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = TFDistilBertModel.from_pretrained('distilbert-base-uncased')
input_ids = tf.constant(tokenizer.encode("Hello, my dog is cute"), dtype="int32")[None, :]  # Batch size 1
outputs = model(input_ids)
last_hidden_states = outputs[0]
model.save_pretrained("./DSB/")
model.save_weights("./DSB/DistDistilBERT_weights.h5")
> tf-1.14.0
import tensorflow as tf
from keras.models import load_model
saver = tf.train.Saver()
model = keras.models.load_model("DSB/tf_model.h5")
sess = keras.backend.get_session()
save_path = saver.save(sess, "/tmp/model.ckpt")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-01f1268a6c60> in <module>()
----> 1 saver = tf.train.Saver()
2 model = load_model("DSB/tf_model.h5")
3 sess = keras.backend.get_session()
4 save_path = saver.save(sess, "model.ckpt")
2 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py in __init__(self, var_list, reshape, sharded, max_to_keep, keep_checkpoint_every_n_hours, name, restore_sequentially, saver_def, builder, defer_build, allow_empty, write_version, pad_step_number, save_relative_paths, filename)
823 time.time() + self._keep_checkpoint_every_n_hours * 3600)
824 elif not defer_build:
--> 825 self.build()
826 if self.saver_def:
827 self._check_saver_def()
/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py in build(self)
835 if context.executing_eagerly():
836 raise RuntimeError("Use save/restore instead of build in eager mode.")
--> 837 self._build(self._filename, build_save=True, build_restore=True)
838
839 def _build_eager(self, checkpoint_path, build_save, build_restore):
/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py in _build(self, checkpoint_path, build_save, build_restore)
860 return
861 else:
--> 862 raise ValueError("No variables to save")
863 self._is_empty = False
864
ValueError: No variables to save
> tf-2.0.0
import tensorflow as tf
from tensorflow.keras.models import load_model
saver = tf.train.Checkpoint()
model = load_model('DSB/tf_model.h5', compile=False)
sess = tf.compat.v1.keras.backend.get_session()
save_path = saver.save('model.ckpt')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-13dd44da36a5> in <module>()
1 saver = tf.train.Checkpoint()
----> 2 model = load_model('DSB/tf_model.h5', compile=False)
3 sess = tf.compat.v1.keras.backend.get_session()
4 save_path = saver.save('model.ckpt')
1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/saving/save.py in load_model(filepath, custom_objects, compile)
144 if (h5py is not None and (
145 isinstance(filepath, h5py.File) or h5py.is_hdf5(filepath))):
--> 146 return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
147
148 if isinstance(filepath, six.string_types):
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
163 model_config = f.attrs.get('model_config')
164 if model_config is None:
--> 165 raise ValueError('No model found in config file.')
166 model_config = json.loads(model_config.decode('utf-8'))
167 model = model_config_lib.model_from_config(model_config,
ValueError: No model found in config file.
> tf-2.0.0
import tensorflow as tf
from keras.models import load_model
saver = tf.train.Checkpoint()
model = load_model('DSB/tf_model.h5', compile=False)
sess = tf.compat.v1.keras.backend.get_session()
save_path = saver.save('model.ckpt')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-15-13dd44da36a5> in <module>()
1 saver = tf.train.Checkpoint()
----> 2 model = load_model('DSB/tf_model.h5', compile=False)
3 sess = tf.compat.v1.keras.backend.get_session()
4 save_path = saver.save('model.ckpt')
3 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py in load_wrapper(*args, **kwargs)
456 os.remove(tmp_filepath)
457 return res
--> 458 return load_function(*args, **kwargs)
459
460 return load_wrapper
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py in load_model(filepath, custom_objects, compile)
548 if H5Dict.is_supported_type(filepath):
549 with H5Dict(filepath, mode='r') as h5dict:
--> 550 model = _deserialize_model(h5dict, custom_objects, compile)
551 elif hasattr(filepath, 'write') and callable(filepath.write):
552 def load_function(h5file):
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py in _deserialize_model(h5dict, custom_objects, compile)
237 return obj
238
--> 239 model_config = h5dict['model_config']
240 if model_config is None:
241 raise ValueError('No model found in config.')
/usr/local/lib/python3.6/dist-packages/keras/utils/io_utils.py in __getitem__(self, attr)
316 else:
317 if self.read_only:
--> 318 raise ValueError('Cannot create group in read-only mode.')
319 val = H5Dict(self.data.create_group(attr))
320 return val
ValueError: Cannot create group in read-only mode.
I see. The .h5 file does not contain the model structure, only the weights, so the model cannot be recreated from it. That means it is necessary to rebuild the model in Keras for that method to work. I think that is simply not feasible for you.
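One way to rebuild the structure without writing the Keras layers by hand may be to let transformers do it. This is a hedged sketch and not a confirmed solution from this thread: it assumes TF 2.x and a transformers release that ships the TF model classes, and the function name convert_h5_to_checkpoint is hypothetical.

```python
def convert_h5_to_checkpoint(model_dir, ckpt_prefix):
    """Rebuild DistilBERT from its config, load the .h5 weights, save a TF checkpoint."""
    # Imported here so the helper can be defined without TensorFlow installed.
    import tensorflow as tf
    from transformers import TFDistilBertModel

    # from_pretrained reads config.json to rebuild the architecture, then
    # loads the weights stored in tf_model.h5 from the same directory.
    model = TFDistilBertModel.from_pretrained(model_dir)

    # Object-based TF 2.x checkpoint; save() returns the path prefix it wrote.
    checkpoint = tf.train.Checkpoint(model=model)
    save_path = checkpoint.save(ckpt_prefix)

    # List the stored (name, shape) pairs to inspect how the variables were named.
    return save_path, tf.train.list_variables(save_path)
```

It could be called as convert_h5_to_checkpoint("./DSB/", "./DSB/ckpt/distilbert"), using the directory written by save_pretrained earlier in the thread. Note that the variable names in an object-based checkpoint are derived from the object graph and not from TF 1 variable scopes, so they will likely not match the names a BERT-style assignment map expects without a renaming step.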
Hey, you can load the model with: loaded_model = TFDistilBertForSequenceClassification.from_pretrained("directory")
@JKP0 were you able to solve the issue?
How did you solve this problem? Can anyone help with this? How can I get .ckpt files for muril-base-cased/tf_model.h5?
model.save_pretrained('dir')
produces tf_model.h5. How do I get .ckpt files for it?