adapter-hub / adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning
https://docs.adapterhub.ml
Apache License 2.0

`TypeError` when setting the `adapter_config` argument for run_qa.py #564

Closed: eusojk closed this issue 11 months ago

eusojk commented 1 year ago

Environment info

Information

Model I am using (Bert, XLNet ...): Roberta

Language I am using the model on (English, Chinese ...): English

Adapter setup I am using (if any): I tried using the pfeiffer+inv and pfeiffer configs (as seen in Cases A and B below)

The problem arises when using:

The task I am working on is:

To reproduce

Steps to reproduce the behavior:

Case A

  1. Use the same arguments as in the documentation's run_mlm.py use case
  2. Add the remaining arguments needed for SQuAD2-format datasets and run run_qa.py like so:
#!/bin/sh
export TRAIN_FILE=./trial1_train.json
export VALIDATION_FILE=./trial1_dev.json
export MODELNAME=roberta-large
export OUTDIR=./model_outputs/

python run_qa.py  \
    --model_name_or_path $MODELNAME  \
    --train_file $TRAIN_FILE   \
    --validation_file $VALIDATION_FILE \
    --output_dir $OUTDIR \
    --overwrite_output_dir \
    --version_2_with_negative \
    --do_train  \
    --do_eval \
    --per_device_train_batch_size 12   \
    --learning_rate 1e-4   \
    --num_train_epochs 10.0   \
    --doc_stride 128   \
    --max_seq_length 384   \
    --train_adapter \
    --adapter_config "pfeiffer+inv"

Running the above yields the following error, `TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str`, as seen in this traceback:

Traceback (most recent call last):
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 697, in <module>
    main()
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 616, in main
    setup_adapter_training(model, adapter_args, training_args, data_args.dataset_name or "squad")
  File "/mnt/home/eusojk/anaconda3/lib/python3.10/site-packages/transformers/adapters/training.py", line 60, in setup_adapter_training
    adapter_config = AdapterConfigBase.load(adapter_args.adapter_config, **adapter_config_kwargs)
TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str
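
For what it's worth, the error message itself can be reproduced with plain Python: unpacking a str with ** raises exactly this TypeError. My guess (not verified against the library internals) is that adapter_config_kwargs ends up being a string rather than a dict. A minimal, self-contained sketch of the mechanism, using a hypothetical stand-in function:

# Stand-in for AdapterConfigBase.load, not the library's actual code.
def load(config, **kwargs):
    return config, kwargs

adapter_config_kwargs = "pfeiffer+inv"  # a str where a dict (mapping) is expected

try:
    # The error is raised at call time, before load() even runs,
    # because ** only accepts a mapping.
    load("pfeiffer+inv", **adapter_config_kwargs)
except TypeError as err:
    print(err)  # load() argument after ** must be a mapping, not str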

Case B

In my attempts to fix the problem, I referred to this to set the `--adapter_config` argument to the path of a JSON file (called pfeiffer.json, with the contents found at this configuration page).

So my pfeiffer.json looks like this:

{
    "ln_after": false,
    "ln_before": false,
    "mh_adapter": false,
    "output_adapter": true,
    "adapter_residual_before_ln": false,
    "non_linearity": null,
    "original_ln_after": true,
    "original_ln_before": true,
    "reduction_factor": null,
    "residual_before_ln": true
  }

And my bash script now looks like this:

#!/bin/sh
export TRAIN_FILE=./trial1_train.json
export VALIDATION_FILE=./trial1_dev.json
export MODELNAME=roberta-large
export OUTDIR=./model_outputs/
export CONFIG_FILE=./pfeiffer.json

python run_qa.py  \
    --model_name_or_path $MODELNAME  \
    --train_file $TRAIN_FILE   \
    --validation_file $VALIDATION_FILE \
    --output_dir $OUTDIR \
    --overwrite_output_dir \
    --version_2_with_negative \
    --do_train  \
    --do_eval \
    --per_device_train_batch_size 12   \
    --learning_rate 1e-4   \
    --num_train_epochs 10.0   \
    --doc_stride 128   \
    --max_seq_length 384   \
    --train_adapter \
    --adapter_config $CONFIG_FILE

Running the above also yields the same error, `TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str`, as shown here:

Traceback (most recent call last):
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 697, in <module>
    main()
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 616, in main
    setup_adapter_training(model, adapter_args, training_args, data_args.dataset_name or "squad")
  File "/mnt/home/eusojk/anaconda3/lib/python3.10/site-packages/transformers/adapters/training.py", line 60, in setup_adapter_training
    adapter_config = AdapterConfigBase.load(adapter_args.adapter_config, **adapter_config_kwargs)
TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str
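
As a possible workaround, I also considered skipping the --adapter_config flag entirely and setting the adapter up in code inside run_qa.py. This is only an untested sketch of what I believe the adapter-transformers API allows (the adapter name "qa_adapter" is my own choice):

import json

from transformers import AutoModelForQuestionAnswering
from transformers.adapters import AdapterConfig

model = AutoModelForQuestionAnswering.from_pretrained("roberta-large")

# Option 1: pass a known config identifier string directly.
config = AdapterConfig.load("pfeiffer+inv")

# Option 2: read the pfeiffer.json shown above and pass a real dict (a mapping).
with open("./pfeiffer.json") as f:
    config = AdapterConfig.load(json.load(f))

model.add_adapter("qa_adapter", config=config)  # add a fresh adapter module
model.train_adapter("qa_adapter")               # freeze the base model, train only the adapter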

Expected behavior

I expected the model to train normally in adapter mode, without needing to fully fine-tune it. That is, I expected results like the following at the end of a successful run of the script:

# all_results.json

{
    "epoch": 10.0,
    "eval_HasAns_exact": 80,
    "eval_HasAns_f1": 70,
    "eval_HasAns_total": 111,
    "eval_best_exact": 80,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 70,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 80,
    "eval_f1": 70,
    "eval_runtime": 2.6749,
    "eval_samples": 129,
    "eval_samples_per_second": 48.225,
    "eval_steps_per_second": 3.365,
    "eval_total": 111,
    "train_loss": 1.4477167261057886,
    "train_runtime": 746.8198,
    "train_samples": 1383,
    "train_samples_per_second": 18.519,
    "train_steps_per_second": 0.777
}

Complete Output


  0%|          | 0/2 [00:00<?, ?it/s]
100%|██████████| 2/2 [00:00<00:00, 653.93it/s]
[INFO|configuration_utils.py:666] 2023-07-05 10:58:50,330 >> loading configuration file config.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/config.json
[INFO|configuration_utils.py:718] 2023-07-05 10:58:50,337 >> Model config RobertaConfig {
  "_name_or_path": "roberta-large",
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.26.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|tokenization_auto.py:458] 2023-07-05 10:58:50,379 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
[INFO|configuration_utils.py:666] 2023-07-05 10:58:50,418 >> loading configuration file config.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/config.json
[INFO|configuration_utils.py:718] 2023-07-05 10:58:50,419 >> Model config RobertaConfig {
  "_name_or_path": "roberta-large",
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.26.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,512 >> loading file vocab.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/vocab.json
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,513 >> loading file merges.txt from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/merges.txt
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,513 >> loading file tokenizer.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/tokenizer.json
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,514 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,514 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,515 >> loading file tokenizer_config.json from cache at None
[INFO|configuration_utils.py:666] 2023-07-05 10:58:50,515 >> loading configuration file config.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/config.json
[INFO|configuration_utils.py:718] 2023-07-05 10:58:50,517 >> Model config RobertaConfig {
  "_name_or_path": "roberta-large",
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.26.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|modeling_utils.py:2275] 2023-07-05 10:58:50,642 >> loading weights file pytorch_model.bin from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/pytorch_model.bin
[WARNING|modeling_utils.py:2850] 2023-07-05 10:58:53,473 >> Some weights of the model checkpoint at roberta-large were not used when initializing RobertaForQuestionAnswering: ['lm_head.dense.weight', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.bias']
- This IS expected if you are initializing RobertaForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[WARNING|modeling_utils.py:2862] 2023-07-05 10:58:53,474 >> Some weights of RobertaForQuestionAnswering were not initialized from the model checkpoint at roberta-large and are newly initialized: ['qa_outputs.weight', 'qa_outputs.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa.py", line 697, in <module>
    main()
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa.py", line 616, in main
    setup_adapter_training(model, adapter_args, training_args, data_args.dataset_name or "squad")
  File "/mnt/home/eusojk/anaconda3/lib/python3.10/site-packages/transformers/adapters/training.py", line 60, in setup_adapter_training
    adapter_config = AdapterConfigBase.load(adapter_args.adapter_config, **adapter_config_kwargs)
TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str

Thanks so much in advance!

eusojk commented 1 year ago

Hi there! I was wondering if there has been any progress or advice on this? Thanks so much in advance!

calpt commented 11 months ago

Hey, sorry for not responding to this issue!

This bug should have been resolved with the release of the new Adapters library (v0.1.0). Please re-open and let us know if this is not the case. Thanks!
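
For reference, a rough sketch of the equivalent setup with the new standalone Adapters library (exact config names such as "seq_bn_inv", the renamed "pfeiffer+inv", may differ slightly depending on your version):

import adapters
from transformers import AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained("roberta-large")
adapters.init(model)  # attach adapter support to the plain Hugging Face model

model.add_adapter("squad_adapter", config="seq_bn_inv")  # bottleneck adapter + invertible layers
model.train_adapter("squad_adapter")                     # train only the adapter weights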