huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

In built code not able to download for "bert-base-uncased" when running on cluster. #8137

Closed Souravroych closed 3 years ago

Souravroych commented 4 years ago

Traceback (most recent call last):
  File "/users/sroychou/BERT_text_summarisation/scripts/train_bert_summarizer.py", line 12, in <module>
    from metrics import optimizer, loss_function, label_smoothing, get_loss_and_accuracy, tf_write_summary, monitor_run
  File "/users/sroychou/BERT_text_summarisation/scripts/metrics.py", line 16, in <module>
    _, _, _ = b_score(["I'm Batman"], ["I'm Spiderman"], lang='en', model_type='bert-base-uncased')
  File "/users/sroychou/.local/lib/python3.7/site-packages/bert_score/score.py", line 105, in score
    tokenizer = AutoTokenizer.from_pretrained(model_type)
  File "/users/sroychou/.local/lib/python3.7/site-packages/transformers/tokenization_auto.py", line 298, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/users/sroychou/.local/lib/python3.7/site-packages/transformers/configuration_auto.py", line 330, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/users/sroychou/.local/lib/python3.7/site-packages/transformers/configuration_utils.py", line 382, in get_config_dict
    raise EnvironmentError(msg)
OSError: Can't load config for 'bert-base-uncased'. Make sure that:

LysandreJik commented 4 years ago

It seems that you have no internet access.

Souravroych commented 4 years ago

Thank you. We also found out that the cluster doesn't have internet access. If possible, I can manually download the model and put it in a cache folder; could you suggest where to put this cache folder so that the code can access the model from there?

LysandreJik commented 4 years ago

You could put it in any folder and point to that folder instead! The from_pretrained method takes either an identifier pointing to the S3 bucket, or a local path containing the required files.

The files must be named correctly, however (pytorch_model.bin for the PT model, tf_model.h5 for the TF model, and config.json for the configuration).

I guess the easiest for you would be to do something like the following:

# 1. Create the model cache

mkdir model_cache
cd model_cache
python

# 2. Download and save the models to the cache (here are two examples with BERT and RoBERTa)

# When doing this, be careful that the architectures you use contain all the trained
# layers you will need for your task. Loading the models with the architectures they
# were pre-trained with ensures that all of these layers are kept.
from transformers import BertForPreTraining, BertTokenizer, RobertaForMaskedLM, RobertaTokenizer

BertForPreTraining.from_pretrained("bert-base-cased").save_pretrained("bert-cache")
BertTokenizer.from_pretrained("bert-base-cased").save_pretrained("bert-cache")

RobertaForMaskedLM.from_pretrained("roberta-base").save_pretrained("roberta-cache")
RobertaTokenizer.from_pretrained("roberta-base").save_pretrained("roberta-cache")

You can check that the folder now contains all the appropriate files:

ls -LR

# Outputs the following
./bert-cache:
config.json  pytorch_model.bin  special_tokens_map.json  tokenizer_config.json  vocab.txt

./roberta-cache:
config.json  merges.txt  pytorch_model.bin  special_tokens_map.json  tokenizer_config.json  vocab.json

You can then move your folder model_cache to your machine which has no internet access. Hope that helps.
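On the offline machine, the same from_pretrained calls can then be given the local folder paths instead of hub identifiers. One way to keep the rest of the code unchanged is a small lookup helper (hypothetical, not part of transformers) that rewrites cached identifiers to local paths, assuming the folder layout created above:

```python
import os

# Hypothetical mapping from model identifiers to the cache folders created above.
LOCAL_MODELS = {
    "bert-base-cased": "bert-cache",
    "roberta-base": "roberta-cache",
}

def resolve_model(identifier, cache_root="model_cache"):
    """Return a local path if the identifier was cached, else the identifier itself."""
    if identifier in LOCAL_MODELS:
        return os.path.join(cache_root, LOCAL_MODELS[identifier])
    return identifier

# e.g. AutoTokenizer.from_pretrained(resolve_model("bert-base-cased"))
```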

Souravroych commented 4 years ago

Thanks a lot for the detailed explanation. I followed your steps and saved the checkpoints in model_cache and uncased_l12 (with the same contents). However, it shows a KeyError when it references the model_cache folder:

INFO:tensorflow:Extracting pretrained word embeddings weights from BERT
2020-10-30 14:37:43.909781: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Some layers from the model checkpoint at /users/sroychou/uncased_l12/ were not used when initializing TFBertModel: ['nsp___cls', 'mlm___cls']

Is there something I am doing wrong? I've been stuck on this for some time.

LysandreJik commented 4 years ago

Hmm, well it seems that this is an issue with bert_score? I don't know what BERT_text_summarisation is, what the metrics script is, or what the bert_score package is.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.