adapter-hub / adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning
https://docs.adapterhub.ml
Apache License 2.0

`TypeError` when setting the `adapter_config` argument for run_qa.py #564

Closed: eusojk closed this issue 11 months ago

eusojk commented 1 year ago

Environment info

Information

Model I am using (Bert, XLNet ...): Roberta

Language I am using the model on (English, Chinese ...): English

Adapter setup I am using (if any): I tried using the pfeiffer+inv and pfeiffer configs (as seen in Cases A and B below)

The problem arises when using:

The task I am working on is:

To reproduce

Steps to reproduce the behavior:

Case A

  1. Use the same arguments as in the documentation's run_mlm.py use case
  2. Add the remaining arguments needed for SQuAD2-format datasets and run run_qa.py like so:
#!/bin/sh
export TRAIN_FILE=./trial1_train.json
export VALIDATION_FILE=./trial1_dev.json
export MODELNAME=roberta-large
export OUTDIR=./model_outputs/

python run_qa.py  \
    --model_name_or_path $MODELNAME  \
    --train_file $TRAIN_FILE   \
    --validation_file $VALIDATION_FILE \
    --output_dir $OUTDIR \
    --overwrite_output_dir \
    --version_2_with_negative \
    --do_train  \
    --do_eval \
    --per_device_train_batch_size 12   \
    --learning_rate 1e-4   \
    --num_train_epochs 10.0   \
    --doc_stride 128   \
    --max_seq_length 384   \
    --train_adapter \
    --adapter_config "pfeiffer+inv"

Running the above yields the following error, `TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str`, as seen in this traceback:

Traceback (most recent call last):
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 697, in <module>
    main()
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 616, in main
    setup_adapter_training(model, adapter_args, training_args, data_args.dataset_name or "squad")
  File "/mnt/home/eusojk/anaconda3/lib/python3.10/site-packages/transformers/adapters/training.py", line 60, in setup_adapter_training
    adapter_config = AdapterConfigBase.load(adapter_args.adapter_config, **adapter_config_kwargs)
TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str
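
For what it's worth, the error message itself can be reproduced with plain Python: unpacking a str with ** raises exactly this TypeError. My guess (not verified against the library internals) is that adapter_config_kwargs ends up being a string rather than a dict. A minimal, self-contained sketch of the mechanism, using a hypothetical stand-in function:

# Stand-in for AdapterConfigBase.load, not the library's actual code.
def load(config, **kwargs):
    return config, kwargs

adapter_config_kwargs = "pfeiffer+inv"  # a str where a dict (mapping) is expected

try:
    # The error is raised at call time, before load() even runs,
    # because ** only accepts a mapping.
    load("pfeiffer+inv", **adapter_config_kwargs)
except TypeError as err:
    print(err)  # load() argument after ** must be a mapping, not str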

Case B

In my attempts to fix the problem, I referred to this to set the `--adapter_config` argument to the path of a JSON file (called pfeiffer.json, with the contents found at this configuration page).

So my pfeiffer.json looks like this:

{
    "ln_after": false,
    "ln_before": false,
    "mh_adapter": false,
    "output_adapter": true,
    "adapter_residual_before_ln": false,
    "non_linearity": null,
    "original_ln_after": true,
    "original_ln_before": true,
    "reduction_factor": null,
    "residual_before_ln": true
  }

And my bash script now looks like this:

#!/bin/sh
export TRAIN_FILE=./trial1_train.json
export VALIDATION_FILE=./trial1_dev.json
export MODELNAME=roberta-large
export OUTDIR=./model_outputs/
export CONFIG_FILE=./pfeiffer.json

python run_qa.py  \
    --model_name_or_path $MODELNAME  \
    --train_file $TRAIN_FILE   \
    --validation_file $VALIDATION_FILE \
    --output_dir $OUTDIR \
    --overwrite_output_dir \
    --version_2_with_negative \
    --do_train  \
    --do_eval \
    --per_device_train_batch_size 12   \
    --learning_rate 1e-4   \
    --num_train_epochs 10.0   \
    --doc_stride 128   \
    --max_seq_length 384   \
    --train_adapter \
    --adapter_config $CONFIG_FILE

Running the above also yields the same error, `TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str`, as shown here:

Traceback (most recent call last):
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 697, in <module>
    main()
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa_ag.py", line 616, in main
    setup_adapter_training(model, adapter_args, training_args, data_args.dataset_name or "squad")
  File "/mnt/home/eusojk/anaconda3/lib/python3.10/site-packages/transformers/adapters/training.py", line 60, in setup_adapter_training
    adapter_config = AdapterConfigBase.load(adapter_args.adapter_config, **adapter_config_kwargs)
TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str
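
As a possible workaround, I also considered skipping the --adapter_config flag entirely and setting the adapter up in code inside run_qa.py. This is only an untested sketch of what I believe the adapter-transformers API allows (the adapter name "qa_adapter" is my own choice):

import json

from transformers import AutoModelForQuestionAnswering
from transformers.adapters import AdapterConfig

model = AutoModelForQuestionAnswering.from_pretrained("roberta-large")

# Option 1: pass a known config identifier string directly.
config = AdapterConfig.load("pfeiffer+inv")

# Option 2: read the pfeiffer.json shown above and pass a real dict (a mapping).
with open("./pfeiffer.json") as f:
    config = AdapterConfig.load(json.load(f))

model.add_adapter("qa_adapter", config=config)  # add a fresh adapter module
model.train_adapter("qa_adapter")               # freeze the base model, train only the adapter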

Expected behavior

I expected the model to train normally in adapter mode, without needing to fully fine-tune it. That is, I expected results like the following at the end of a successful run of the script:

# all_results.json

{
    "epoch": 10.0,
    "eval_HasAns_exact": 80,
    "eval_HasAns_f1": 70,
    "eval_HasAns_total": 111,
    "eval_best_exact": 80,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 70,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 80,
    "eval_f1": 70,
    "eval_runtime": 2.6749,
    "eval_samples": 129,
    "eval_samples_per_second": 48.225,
    "eval_steps_per_second": 3.365,
    "eval_total": 111,
    "train_loss": 1.4477167261057886,
    "train_runtime": 746.8198,
    "train_samples": 1383,
    "train_samples_per_second": 18.519,
    "train_steps_per_second": 0.777
}

Complete Output


  0%|          | 0/2 [00:00<?, ?it/s]
100%|██████████| 2/2 [00:00<00:00, 653.93it/s]
[INFO|configuration_utils.py:666] 2023-07-05 10:58:50,330 >> loading configuration file config.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/config.json
[INFO|configuration_utils.py:718] 2023-07-05 10:58:50,337 >> Model config RobertaConfig {
  "_name_or_path": "roberta-large",
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.26.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|tokenization_auto.py:458] 2023-07-05 10:58:50,379 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
[INFO|configuration_utils.py:666] 2023-07-05 10:58:50,418 >> loading configuration file config.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/config.json
[INFO|configuration_utils.py:718] 2023-07-05 10:58:50,419 >> Model config RobertaConfig {
  "_name_or_path": "roberta-large",
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.26.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,512 >> loading file vocab.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/vocab.json
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,513 >> loading file merges.txt from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/merges.txt
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,513 >> loading file tokenizer.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/tokenizer.json
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,514 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,514 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:1802] 2023-07-05 10:58:50,515 >> loading file tokenizer_config.json from cache at None
[INFO|configuration_utils.py:666] 2023-07-05 10:58:50,515 >> loading configuration file config.json from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/config.json
[INFO|configuration_utils.py:718] 2023-07-05 10:58:50,517 >> Model config RobertaConfig {
  "_name_or_path": "roberta-large",
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.26.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}

[INFO|modeling_utils.py:2275] 2023-07-05 10:58:50,642 >> loading weights file pytorch_model.bin from cache at /mnt/home/eusojk/.cache/huggingface/hub/models--roberta-large/snapshots/716877d372b884cad6d419d828bac6c85b3b18d9/pytorch_model.bin
[WARNING|modeling_utils.py:2850] 2023-07-05 10:58:53,473 >> Some weights of the model checkpoint at roberta-large were not used when initializing RobertaForQuestionAnswering: ['lm_head.dense.weight', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.bias']
- This IS expected if you are initializing RobertaForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[WARNING|modeling_utils.py:2862] 2023-07-05 10:58:53,474 >> Some weights of RobertaForQuestionAnswering were not initialized from the model checkpoint at roberta-large and are newly initialized: ['qa_outputs.weight', 'qa_outputs.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa.py", line 697, in <module>
    main()
  File "/mnt/ufs18/home-064/eusojk/adapter-transformers/examples/pytorch/question-answering/run_qa.py", line 616, in main
    setup_adapter_training(model, adapter_args, training_args, data_args.dataset_name or "squad")
  File "/mnt/home/eusojk/anaconda3/lib/python3.10/site-packages/transformers/adapters/training.py", line 60, in setup_adapter_training
    adapter_config = AdapterConfigBase.load(adapter_args.adapter_config, **adapter_config_kwargs)
TypeError: transformers.adapters.configuration.AdapterConfigBase.load() argument after ** must be a mapping, not str

Thanks so much in advance!

eusojk commented 1 year ago

Hi there! I was wondering if there has been any progress or advice on this? Thanks so much in advance!

calpt commented 11 months ago

Hey, sorry for not responding to this issue!

This bug should have been resolved with the release of the new Adapters library (v0.1.0). Please re-open and let us know if this is not the case. Thanks!
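
For reference, a rough sketch of the equivalent setup with the new standalone Adapters library (exact config names such as "seq_bn_inv", the renamed "pfeiffer+inv", may differ slightly depending on your version):

import adapters
from transformers import AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained("roberta-large")
adapters.init(model)  # attach adapter support to the plain Hugging Face model

model.add_adapter("squad_adapter", config="seq_bn_inv")  # bottleneck adapter + invertible layers
model.train_adapter("squad_adapter")                     # train only the adapter weights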