You should first convert your checkpoint to a huggingface checkpoint, using the conversion script. You can check the docs here.
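For reference, a minimal sketch of that conversion in Python (this mirrors what the convert_bert_original_tf_checkpoint_to_pytorch script does; the google-bert-tiny paths are the files listed at the bottom of this issue, and TensorFlow must be installed to read the checkpoint):
from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

# build the model from the original Google config, then load the TF weights into it
config = BertConfig.from_json_file("google-bert-tiny/bert_config.json")
model = BertForPreTraining(config)
load_tf_weights_in_bert(model, config, "google-bert-tiny/bert_model.ckpt")

# writes pytorch_model.bin and config.json next to the original files
model.save_pretrained("google-bert-tiny")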
Hi @LysandreJik
Thank you so much for the response. After training I will get a PyTorch checkpoint, right? What is the procedure to get a TF checkpoint?
Hi @LysandreJik, I tried the above approach and converted my model to a Hugging Face checkpoint.
Now when I run the command below:
python run_mlm_wwm.py \
--model_name_or_path google-bert-tiny/pytorch_model.bin \
--config_name google-bert-tiny/bert_config.json \
--train_file train.txt \
--validation_file val.txt \
--do_train \
--do_eval \
--output_dir test-mlm-wwm \
--cache_dir cache
I am getting this error:
Traceback (most recent call last):
File "run_mlm_wwm.py", line 340, in <module>
main()
File "run_mlm_wwm.py", line 236, in main
tokenizer = AutoTokenizer.from_pretrained(
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/transformers/tokenization_auto.py", line 306, in from_pretrained
config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/transformers/configuration_auto.py", line 333, in from_pretrained
config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/transformers/configuration_utils.py", line 391, in get_config_dict
config_dict = cls._dict_from_json_file(resolved_config_file)
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/transformers/configuration_utils.py", line 474, in _dict_from_json_file
text = reader.read()
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte
@thomwolf
I believe the model_name_or_path should point to a directory containing both the configuration and model files, with their appropriate names (config.json, pytorch_model.bin):
directory
- config.json
- pytorch_model.bin
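If the conversion left the config under its original name (bert_config.json), a one-line sketch to give it the name from_pretrained expects:
import shutil

# from_pretrained looks for config.json, not bert_config.json
shutil.copy("google-bert-tiny/bert_config.json", "google-bert-tiny/config.json")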
Regarding your question about converting a model to a TensorFlow implementation: you can first convert your model to PyTorch and then load it in TensorFlow.
Let's say you saved the model in a directory called directory:
from transformers import TFBertForPreTraining

tf_model = TFBertForPreTraining.from_pretrained("directory", from_pt=True)
You can then save it as any other TensorFlow model.
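For example (tf-directory is just an illustrative name):
# writes tf_model.h5 and config.json into the target directory
tf_model.save_pretrained("tf-directory")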
Hi @LysandreJik
After pointing the config and model arguments at the folder, I ran:
from transformers import convert_pytorch_checkpoint_to_tf2
convert_pytorch_checkpoint_to_tf2.convert_pt_checkpoint_to_tf(
model_type = "bert",
pytorch_checkpoint_path="model/",
config_file="model/config.json",
tf_dump_path="TFmodel",
compare_with_pt_model=False,
use_cached_models=False
)
I am getting this error:
Loading PyTorch weights from /home/3551351/bert-mlm/model
Traceback (most recent call last):
File "pt2tf.py", line 7, in <module>
convert_pytorch_checkpoint_to_tf2.convert_pt_checkpoint_to_tf(
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/transformers/convert_pytorch_checkpoint_to_tf2.py", line 283, in convert_pt_checkpoint_to_tf
tf_model = load_pytorch_checkpoint_in_tf2_model(tf_model, pytorch_checkpoint_path)
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/transformers/modeling_tf_pytorch_utils.py", line 93, in load_pytorch_checkpoint_in_tf2_model
pt_state_dict = torch.load(pt_path, map_location="cpu")
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/torch/serialization.py", line 581, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/3551351/.conda/envs/kuldeepVenv/lib/python3.8/site-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: '/home/3551351/bert-mlm/model'
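Note that the traceback shows torch.load being handed the directory itself, so pointing pytorch_checkpoint_path at the weights file should avoid the IsADirectoryError. A sketch, assuming the converted file is named pytorch_model.bin:
from transformers import convert_pytorch_checkpoint_to_tf2

convert_pytorch_checkpoint_to_tf2.convert_pt_checkpoint_to_tf(
    model_type="bert",
    pytorch_checkpoint_path="model/pytorch_model.bin",  # the weights file, not the folder
    config_file="model/config.json",
    tf_dump_path="TFmodel/tf_model.h5",  # the dump path is a file as well
    compare_with_pt_model=False,
    use_cached_models=False,
)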
I'm sorry, I think you misunderstood me. I was saying that about the way you launch your script, not the way you do the conversion:
python run_mlm_wwm.py \
--model_name_or_path google-bert-tiny \
--config_name google-bert-tiny \
--train_file train.txt \
--validation_file val.txt \
--do_train \
--do_eval \
--output_dir test-mlm-wwm \
--cache_dir cache
This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions.
If you think this still needs to be addressed please comment on this thread.
Environment info
transformers version: 3.5.1

Who can help
I think: @patrickvonplaten @LysandreJik @VictorSanh
Anyone is welcome!

Information
I am using examples/language-modeling/run_mlm_wwm.py to train my own Tiny BERT model.

To reproduce
Using Tiny BERT from Google: https://github.com/google-research/bert/blob/master/README.md
Using examples/language-modeling/run_mlm_wwm.py from HuggingFace to train a language model on raw text.
Files in my google-bert-tiny are: bert_config.json, bert_model.ckpt.data-00000-of-00001, bert_model.ckpt.index, vocab.txt

Steps to reproduce the behavior:
Run examples/language-modeling/run_mlm_wwm.py from HuggingFace > Transformers (link).
Error: see the UnicodeDecodeError traceback above.

Expected behavior
I want it to train.