Issues with run_glue.py script: Unrecognized arguments (do_contrastive_cls) and unable to parse test dataset

When running the script with the --do_contrastive_cls argument, I receive the following error:

Traceback (most recent call last):
  File "ContrastiveAA-main\examples\pytorch\text-classification\run_glue.py", line 626, in <module>
    main()
  File "ContrastiveAA-main\examples\pytorch\text-classification\run_glue.py", line 218, in main
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ahmad\anaconda3\envs\My_envmt\Lib\site-packages\transformers\hf_argparser.py", line 348, in parse_args_into_dataclasses
    raise ValueError(f"Some specified arguments are not used by the HfArgumentParser: {remaining_args}")
ValueError: Some specified arguments are not used by the HfArgumentParser: ['--do_contrastive_cls']

When running the script without the --do_contrastive_cls argument, I receive the following error:

C:\Users\ahmad\anaconda3\envs\My_envmt\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
[WARNING|modeling_utils.py:4282] 2024-07-31 14:53:17,081 >> Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "ContrastiveAA-main\examples\pytorch\text-classification\run_glue.py", line 626, in <module>
    main()
  File "ContrastiveAA-main\examples\pytorch\text-classification\run_glue.py", line 473, in main
    raise ValueError("--do_predict requires a test dataset")
ValueError: --do_predict requires a test dataset

The command I used is as follows:

python examples/pytorch/text-classification/run_glue.py \
    --model_name_or_path bert-base-uncased \
    --do_train \
    --do_eval \
    --num_train_epochs 5 \
    --gradient_accumulation_steps 4 \
    --test_file data/AA_data/AA_cls_test.json \
    --validation_file data/AA_data/AA_cls_validation.json \
    --train_file data/AA_data/AA_cls_train.json \
    --output_dir AA_region_cls/ \
    --overwrite_output_dir \
    --per_device_train_batch_size=128 \
    --per_device_eval_batch_size=32 \
    --save_strategy no \
    --evaluation_strategy epoch

I have verified that the dataset paths are correct; the train, test, and validation files are all in the same folder. Also, I noticed that the script you posted uses the same file for both validation and test. Is that intended?

Hi,

We have checked and updated the command, please try the following:

python examples/pytorch/text-classification/run_glue.py \
    --model_name_or_path bert-base-uncased \
    --do_train \
    --do_predict \
    --num_train_epochs 5 \
    --gradient_accumulation_steps 4 \
    --test_file data/AA_data/AA_cls_test.json \
    --validation_file data/AA_data/AA_cls_val.json \
    --train_file data/AA_data/AA_cls_train.json \
    --output_dir AA_region_cls/ \
    --overwrite_output_dir \
    --per_device_train_batch_size=128 \
    --per_device_eval_batch_size=32 \
    --save_strategy no \
    --evaluation_strategy epoch

If you still get "ValueError: --do_predict requires a test dataset", please add print(raw_datasets) just before the line that raises it (line 473 of run_glue.py in your traceback) and check that the output includes the train, validation, and test datasets.
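
As a rough illustration of that check, the sketch below reports which of the three expected splits are missing. The helper name missing_splits is hypothetical and not part of run_glue.py. It assumes raw_datasets behaves like a dictionary keyed by split name, and a plain dict stands in for it here so the snippet runs on its own.

```python
from collections.abc import Mapping

# Split names the reply above asks you to look for.
REQUIRED_SPLITS = ("train", "validation", "test")


def missing_splits(raw_datasets: Mapping, required=REQUIRED_SPLITS) -> list:
    """Return the required split names that are absent from raw_datasets.

    Hypothetical helper: assumes raw_datasets is a mapping keyed by split name.
    """
    return [name for name in required if name not in raw_datasets]


# Stand-in for the object built inside run_glue.py (assumption: dict-like).
example = {"train": ["..."], "validation": ["..."]}
print(missing_splits(example))  # → ['test']
```

If the result contains 'test', the test file was not loaded into raw_datasets, which would be consistent with the ValueError shown in the traceback.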

Social-AI-Studio / ContrastiveAA
