huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

AttributeError: 'RobertaConfig' object has no attribute 'attn_type' #10882

Closed sldsee18 closed 3 years ago

sldsee18 commented 3 years ago

Environment

Google Colab, with the '4.5.0.dev0' version of transformers installed via `!pip install git+https://github.com/huggingface/transformers`.

Issue

Hi guys, I tried to fine-tune RoBERTa on WikiText-2 by following the command shared in the examples/language-modeling section of the GitHub repo:

```
python run_mlm.py \
    --model_name_or_path roberta-base \
    --dataset_name wikitext \
    --dataset_config_name wikitext-2-raw-v1 \
    --do_train \
    --do_eval \
    --output_dir /tmp/test-mlm
```

but I ran into the error `AttributeError: 'RobertaConfig' object has no attribute 'attn_type'`. It looks like the model class cannot find the config attribute it needs.

Please advise what I did wrong. Thanks!
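
A minimal sanity check, assuming the same Colab environment: in a healthy install, both Auto classes resolve `roberta-base` to the RoBERTa classes, so anything else printed here points at the install rather than the script:

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# In a correct install, roberta-base resolves to the RoBERTa classes.
config = AutoConfig.from_pretrained("roberta-base")
print(type(config).__name__)  # expected: RobertaConfig

model = AutoModelForMaskedLM.from_pretrained("roberta-base")
print(type(model).__name__)   # expected: RobertaForMaskedLM
```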

To reproduce

```
python run_mlm.py \
    --model_name_or_path roberta-base \
    --dataset_name wikitext \
    --dataset_config_name wikitext-2-raw-v1 \
    --do_train \
    --do_eval \
    --output_dir /tmp/test-mlm
```

Error message I got:

```
2021-03-24 08:51:51.464928: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
03/24/2021 08:51:52 - WARNING - __main__ - Process rank: -1, device: cpu, n_gpu: 0
distributed training: False, 16-bits training: False
03/24/2021 08:51:53 - INFO - __main__ - Training/evaluation parameters TrainingArguments(output_dir=/tmp/test-mlm, overwrite_output_dir=False, do_train=True, do_eval=True, do_predict=False, evaluation_strategy=IntervalStrategy.NO, prediction_loss_only=False, per_device_train_batch_size=8, per_device_eval_batch_size=8, gradient_accumulation_steps=1, eval_accumulation_steps=None, learning_rate=5e-05, weight_decay=0.0, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, max_grad_norm=1.0, num_train_epochs=3.0, max_steps=-1, lr_scheduler_type=SchedulerType.LINEAR, warmup_ratio=0.0, warmup_steps=0, logging_dir=runs/Mar24_08-51-52_f7b8b5062dd4, logging_strategy=IntervalStrategy.STEPS, logging_first_step=False, logging_steps=500, save_strategy=IntervalStrategy.STEPS, save_steps=500, save_total_limit=None, no_cuda=False, seed=42, fp16=False, fp16_opt_level=O1, fp16_backend=auto, fp16_full_eval=False, local_rank=-1, tpu_num_cores=None, tpu_metrics_debug=False, debug=False, dataloader_drop_last=False, eval_steps=500, dataloader_num_workers=0, past_index=-1, run_name=/tmp/test-mlm, disable_tqdm=False, remove_unused_columns=True, label_names=None, load_best_model_at_end=False, metric_for_best_model=None, greater_is_better=None, ignore_data_skip=False, sharded_ddp=[], deepspeed=None, label_smoothing_factor=0.0, adafactor=False, group_by_length=False, report_to=['tensorboard'], ddp_find_unused_parameters=None, dataloader_pin_memory=True, skip_memory_metrics=False, _n_gpu=0)
03/24/2021 08:51:53 - WARNING - datasets.builder - Reusing dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/47c57a6745aa5ce8e16a5355aaa4039e3aa90d1adad87cef1ad4e0f29e74ac91)
[INFO|configuration_utils.py:472] 2021-03-24 08:51:53,301 >> loading configuration file https://huggingface.co/roberta-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/733bade19e5f0ce98e6531021dd5180994bb2f7b8bd7e80c7968805834ba351e.35205c6cfc956461d8515139f0f8dd5d207a2f336c0c3a83b4bc8dca3518e37b
[INFO|configuration_utils.py:508] 2021-03-24 08:51:53,301 >> Model config RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.5.0.dev0",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|configuration_utils.py:472] 2021-03-24 08:51:53,358 >> loading configuration file https://huggingface.co/roberta-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/733bade19e5f0ce98e6531021dd5180994bb2f7b8bd7e80c7968805834ba351e.35205c6cfc956461d8515139f0f8dd5d207a2f336c0c3a83b4bc8dca3518e37b
[INFO|configuration_utils.py:508] 2021-03-24 08:51:53,359 >> Model config RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.5.0.dev0",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|tokenization_utils_base.py:1702] 2021-03-24 08:51:53,706 >> loading file https://huggingface.co/roberta-base/resolve/main/vocab.json from cache at /root/.cache/huggingface/transformers/d3ccdbfeb9aaa747ef20432d4976c32ee3fa69663b379deb253ccfce2bb1fdc5.d67d6b367eb24ab43b08ad55e014cf254076934f71d832bbab9ad35644a375ab
[INFO|tokenization_utils_base.py:1702] 2021-03-24 08:51:53,707 >> loading file https://huggingface.co/roberta-base/resolve/main/merges.txt from cache at /root/.cache/huggingface/transformers/cafdecc90fcab17011e12ac813dd574b4b3fea39da6dd817813efa010262ff3f.5d12962c5ee615a4c803841266e9c3be9a691a924f72d395d3a6c6c81157788b
[INFO|tokenization_utils_base.py:1702] 2021-03-24 08:51:53,707 >> loading file https://huggingface.co/roberta-base/resolve/main/tokenizer.json from cache at /root/.cache/huggingface/transformers/d53fc0fa09b8342651efd4073d75e19617b3e51287c2a535becda5808a8db287.fc9576039592f026ad76a1c231b89aee8668488c671dfbe6616bab2ed298d730
[INFO|tokenization_utils_base.py:1702] 2021-03-24 08:51:53,707 >> loading file https://huggingface.co/roberta-base/resolve/main/added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:1702] 2021-03-24 08:51:53,707 >> loading file https://huggingface.co/roberta-base/resolve/main/special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:1702] 2021-03-24 08:51:53,707 >> loading file https://huggingface.co/roberta-base/resolve/main/tokenizer_config.json from cache at None
[INFO|modeling_utils.py:1051] 2021-03-24 08:51:53,860 >> loading weights file https://huggingface.co/roberta-base/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/51ba668f7ff34e7cdfa9561e8361747738113878850a7d717dbc69de8683aaad.c7efaa30a0d80b2958b876969faa180e485944a849deee4ad482332de65365a7
Traceback (most recent call last):
  File "/content/drive/MyDrive/Colab Notebooks/run_mlm.py", line 461, in <module>
    main()
  File "/content/drive/MyDrive/Colab Notebooks/run_mlm.py", line 306, in main
    use_auth_token=True if model_args.use_auth_token else None,
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1058, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/xlnet/modeling_xlnet.py", line 1309, in __init__
    self.attn_type = config.attn_type
AttributeError: 'RobertaConfig' object has no attribute 'attn_type'
```
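
The last frames of the traceback show the actual failure: the masked-LM Auto mapping handed the `RobertaConfig` to an XLNet model class, and `attn_type` is an XLNet-only config attribute. A quick illustration using just the public config classes (a sketch, not from the original report):

```python
from transformers import RobertaConfig, XLNetConfig

# attn_type exists only on XLNet configs, so any XLNet module that
# receives a RoBERTa config fails exactly as in the traceback above.
print(hasattr(RobertaConfig(), "attn_type"))  # False
print(XLNetConfig().attn_type)                # "bi" (XLNet's default)
```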

sldsee18 commented 3 years ago

Found the solution in #10446.

I should have installed from source with these steps instead:

```
git clone https://github.com/huggingface/transformers
cd transformers
pip install .
```
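
A quick way to confirm the reinstall took effect, assuming the same environment (restart the Colab runtime first so the new package is picked up):

```python
import transformers
from transformers import AutoModelForMaskedLM

# The version string should now match the cloned source tree
# (a 4.x dev version at the time of this issue).
print(transformers.__version__)

# Loading roberta-base should no longer hit the XLNet code path.
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
print(type(model).__name__)  # RobertaForMaskedLM
```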