Jason-J-Choi / DeBERTa_TxtClassifier

An adversarial evaluation system that fine-tunes DeBERTa for text classification and probes its robustness with TextFooler.

TextFooler on DeBERTa

This repository is a purely academic implementation that evaluates the robustness of DeBERTa on text classification against adversarial examples generated by TextFooler.

We additionally modified the TextFooler adversarial generator to account for DeBERTa's disentangled attention: it swaps the positions of word pairs and applies pair-wise synonym substitution while adhering to the original TextFooler's semantic-similarity requirements.
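The pair-wise perturbation described above can be sketched roughly as follows. This is an illustrative simplification, not the repository's actual code; the function name, the `synonyms` dictionary, and the `swap_prob` parameter are assumptions for the sketch (the real generator also enforces TextFooler's semantic-similarity checks, omitted here):

```python
import random

def pairwise_perturb(tokens, synonyms, swap_prob=0.5):
    """Hypothetical sketch of the modified perturbation: swap adjacent
    word pairs (to probe DeBERTa's relative-position encoding) or
    substitute synonyms for both words in the pair.
    `synonyms` maps a word to a list of candidate replacements."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if random.random() < swap_prob:
            # switch the locations of the word pair
            out[i], out[i + 1] = out[i + 1], out[i]
        else:
            # pair-wise synonym substitution, falling back to the word itself
            out[i] = random.choice(synonyms.get(out[i], [out[i]]))
            out[i + 1] = random.choice(synonyms.get(out[i + 1], [out[i + 1]]))
        i += 2
    return out
```

With `swap_prob=1.0` every adjacent pair is swapped, e.g. `["a", "b", "c", "d"]` becomes `["b", "a", "d", "c"]`.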

Credit for the original development of DeBERTa and TextFooler goes to their respective authors.

Attributes

Introduction to DeBERTa

DeBERTa (Decoding-enhanced BERT with disentangled attention) improves on the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented by two vectors that encode its content and position, respectively, and the attention weights among words are computed using disentangled matrices on their contents and relative positions. Second, an enhanced mask decoder replaces the output softmax layer to predict the masked tokens during model pretraining. The authors show that these two techniques significantly improve both the efficiency of model pre-training and the performance of downstream tasks. Installation instructions and additional documentation can be found on the DeBERTa GitHub page.
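The disentangled attention decomposition can be sketched in a few lines of NumPy. This is a minimal illustration of the idea (content-to-content, content-to-position, and position-to-content terms, scaled by 1/sqrt(3d) as in the DeBERTa paper), not the official implementation; all variable names are our own:

```python
import numpy as np

def disentangled_scores(H, P, Wq, Wk, Wqr, Wkr):
    """Sketch of disentangled attention scores.
    H: (n, d) content vectors for n tokens.
    P: (n, n, d) relative-position vectors, P[i, j] for the pair (i, j).
    Wq, Wk: content projections; Wqr, Wkr: position projections."""
    Qc, Kc = H @ Wq, H @ Wk                      # content projections
    c2c = Qc @ Kc.T                              # content-to-content
    c2p = np.einsum('id,ijd->ij', Qc, P @ Wkr)   # content-to-position
    p2c = np.einsum('jd,ijd->ij', Kc, P @ Wqr)   # position-to-content
    n, d = H.shape
    return (c2c + c2p + p2c) / np.sqrt(3 * d)    # (n, n) score matrix
```

TextFooler's word-pair swaps directly perturb the relative-position terms above, which is what the modified generator is designed to probe.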

Introducing TextFooler

A Model for Natural Language Attack on Text Classification and Inference

Setup

Install ESIM, the supporting system that TextFooler requires:

cd ESIM
python setup.py install
cd ..

Run DeBERTa experiments from command line

For glue tasks,

  1. Get the data

    cache_dir=/tmp/DeBERTa/
    cd experiments/glue
    ./download_data.sh  $cache_dir/glue_tasks
  2. Download models

    python download_model.py
    # it will download the base model

2.1 Train if needed

    python train.py \
        --dataset_path [location of dataset to train with] \
        --target_model [location of downloaded model]
The additional parameters nclasses, target_model_path, learning_rate, and num_epochs are also supported. If target_model_path is not specified, the model will be downloaded automatically from Hugging Face.
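The CLI described above could be wired up roughly like this. The argument names come from this README; the defaults shown are illustrative assumptions, not the repository's actual values:

```python
import argparse

def build_parser():
    """Sketch of the train.py command-line interface described above."""
    p = argparse.ArgumentParser(
        description="Fine-tune DeBERTa for text classification")
    p.add_argument("--dataset_path", required=True,
                   help="location of dataset to train with")
    p.add_argument("--target_model", required=True,
                   help="location of the downloaded base model")
    p.add_argument("--target_model_path", default=None,
                   help="local checkpoint; if omitted, download from Hugging Face")
    p.add_argument("--nclasses", type=int, default=2)        # assumed default
    p.add_argument("--learning_rate", type=float, default=2e-5)
    p.add_argument("--num_epochs", type=int, default=3)
    return p
```

A `target_model_path` of `None` would then trigger the Hugging Face download path mentioned above.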

  3. Run the attack (see the original TextFooler code for more details)

    python textfooler_attack.py \
        --dataset_path [location of dataset to validate] \
        --config_path [location of config.json downloaded in step 2] \
        --target_model [location of model] \
        --target_model_type [base or xxlarge-v2; default: base]

Additional Input for TextFooler

python comp_cos_sim_mat.py [PATH_TO_COUNTER_FITTING_WORD_EMBEDDINGS]
python attack_classification.py
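What comp_cos_sim_mat.py precomputes is, in essence, a pairwise cosine-similarity matrix over the counter-fitted word embeddings, which TextFooler then queries to find semantically close synonym candidates. A minimal sketch of that computation (our own function name, not the script's internals):

```python
import numpy as np

def cos_sim_matrix(embeddings):
    """Pairwise cosine similarity over word vectors.
    embeddings: (vocab_size, dim) array of counter-fitted word vectors.
    Returns a (vocab_size, vocab_size) similarity matrix."""
    E = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(E, axis=1, keepdims=True)
    E = E / np.clip(norms, 1e-12, None)  # L2-normalise each row
    return E @ E.T                       # dot products of unit vectors
```

Precomputing this matrix once is what makes the synonym lookup during the attack fast, at the cost of O(vocab²) memory.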

For natural language inference:

python attack_nli.py

Examples of run commands for these two files are in run_attack_classification.py and run_attack_nli.py.

Two more things to share with you:

  1. In case someone wants to replicate our experiments for training the target models, we have shared the seven processed datasets we used!

  2. In case someone wants to use our generated adversarial results against the benchmark data directly, here it is.

Citation

DeBERTa and TextFooler, respectively:

@misc{he2020deberta,
    title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
    author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
    year={2020},
    eprint={2006.03654},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

@article{jin2019bert,
  title={Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment},
  author={Jin, Di and Jin, Zhijing and Zhou, Joey Tianyi and Szolovits, Peter},
  journal={arXiv preprint arXiv:1907.11932},
  year={2019}
}