XMUDM / SentiWSP

SentiWSP

Code for the paper: Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis

Shuai Fan, Chen Lin, Haonan Li, Zhenghao Lin, Jinsong Su, Hang Zhang, Yeyun Gong, Jian Guo, Nan Duan

Xiamen University, The University of Melbourne, IDEA Research, Microsoft Research Asia

Paper link: https://arxiv.org/abs/2210.09803

Dependencies

Quick Start for Fine-tuning

Our experiments cover sentence-level sentiment classification (e.g., SST-5 / MR / IMDB / Yelp-2 / Yelp-5) and aspect-level sentiment analysis (e.g., Lap14 / Res14).

Load our model (large)

You can download the pre-trained model from (Google Drive) and load it as follows, where save_path is the local directory that holds the downloaded checkpoint:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained(save_path)
model = AutoModelForSequenceClassification.from_pretrained(save_path)

You can also load our model from Hugging Face (https://huggingface.co/shuaifan/SentiWSP):

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("shuaifan/SentiWSP")
model = AutoModelForSequenceClassification.from_pretrained("shuaifan/SentiWSP")

Load our model (base)

You can also load our base model from Hugging Face (https://huggingface.co/shuaifan/SentiWSP-base):

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("shuaifan/SentiWSP-base")
model = AutoModelForSequenceClassification.from_pretrained("shuaifan/SentiWSP-base")
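
The loading snippets above can be wrapped into a small helper. The following is a minimal sketch, not part of the repository: the names `CHECKPOINTS`, `checkpoint_for`, and `class_probabilities` are hypothetical, and `num_labels=2` and `max_length=512` are assumptions. Note that the sequence-classification head may be freshly initialized when loaded from a pre-trained checkpoint, so the returned probabilities are only meaningful after fine-tuning.

```python
# Hugging Face model ids, copied from the links above.
CHECKPOINTS = {
    "large": "shuaifan/SentiWSP",
    "base": "shuaifan/SentiWSP-base",
}


def checkpoint_for(size: str) -> str:
    """Return the Hugging Face model id for a model size ("base" or "large")."""
    try:
        return CHECKPOINTS[size]
    except KeyError:
        raise ValueError(f"unknown model size: {size!r}") from None


def class_probabilities(text: str, size: str = "base", num_labels: int = 2):
    """Run one forward pass and return the class probabilities as a list.

    Hypothetical helper: num_labels and max_length are assumptions, and the
    classification head is likely untrained until the model is fine-tuned.
    """
    # Imported lazily so that checkpoint_for() works without torch installed.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    name = checkpoint_for(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(
        name, num_labels=num_labels
    )
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Softmax over the label dimension; [0] selects the single input sentence.
    return torch.softmax(logits, dim=-1)[0].tolist()
```

Calling `class_probabilities("A moving and well-acted film.", size="base")` downloads the base checkpoint on first use and returns one probability per label.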

Download downstream dataset

You can download the downstream datasets from huggingface/datasets; the download code is in SentiWSP_fine_tunning_SA.py. We also provide some of the downstream datasets on (Google Drive).

Fine-tuning

An example of fine-tuning SentiWSP on sentence-level sentiment classification (IMDB):

python SentiWSP_fine_tunning_SA.py \
    --dataset=imdb \
    --gpu_num=1 \
    --loadmodel=True \
    --loadmodelpath=SentiWSP \
    --batch_size=8 \
    --max_epoch=5 \
    --model_size=large \
    --num_class=2

An example of fine-tuning SentiWSP on aspect-level sentiment analysis (Lap14):

python SentiWSP_fine_tunning_ASBA.py \
    --dataset=laptop \
    --model_name=SentiWSP \
    --batch_size=32 \
    --max_epoch=10 \
    --max_len=128

For SentiWSP and SentiWSP-base, we fine-tune for 3-5 epochs on sentence-level sentiment classification tasks and for 7-10 epochs on aspect-level sentiment classification tasks. We use a learning rate of 2e-5 for SA tasks and 1e-5 for ASBA tasks. The batch size depends on the model size:

model size    batch_size    max_sentence_length
base          32            512
large         8             512
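
The batch sizes listed above can be tied to the model size when assembling the sentence-level fine-tuning command. The following is a minimal sketch that only reuses the flags shown in the IMDB example; the helper name `sa_finetune_command` and its defaults are hypothetical, and the checkpoint path passed to `--loadmodelpath` for the base model is left to the caller.

```python
import shlex

# Batch sizes per model size, taken from the settings listed above.
BATCH_SIZE = {"base": 32, "large": 8}


def sa_finetune_command(dataset, num_class, model_size="large",
                        loadmodelpath="SentiWSP", max_epoch=5, gpu_num=1):
    """Build the argument list for SentiWSP_fine_tunning_SA.py (hypothetical helper)."""
    if model_size not in BATCH_SIZE:
        raise ValueError(f"unknown model size: {model_size!r}")
    return [
        "python", "SentiWSP_fine_tunning_SA.py",
        f"--dataset={dataset}",
        f"--gpu_num={gpu_num}",
        "--loadmodel=True",
        f"--loadmodelpath={loadmodelpath}",
        f"--batch_size={BATCH_SIZE[model_size]}",  # 32 for base, 8 for large
        f"--max_epoch={max_epoch}",
        f"--model_size={model_size}",
        f"--num_class={num_class}",
    ]


# Reproduces the IMDB example as a single shell-safe line.
cmd = sa_finetune_command("imdb", num_class=2)
print(shlex.join(cmd))
```

Returning a list instead of a string makes it possible to pass the result directly to `subprocess.run(cmd, check=True)`.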

Pre-training

If you want to run pre-training yourself instead of using the checkpoints we provide, this part may help you pre-process the pre-training dataset and run the pre-training scripts. We recommend training on multiple NVIDIA Tesla A100 GPUs; the commands below use 4.

Word-level pre-training

python -m torch.distributed.launch \
    --nproc_per_node=4 \
    --master_port=9999 \
    SentiWSP_Pretrain_Word.py \
    --dataset=wiki \
    --size=large \
    --gpu_num=4 \
    --save_pretrain_model=./word5_large_model/ \
    --max_len=128 \
    --batch_size=64 \
    --sentimask_prob=0.5
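
To illustrate what a sentiment masking probability can mean, here is a minimal sketch. It assumes that `--sentimask_prob` is the probability with which a token found in a sentiment lexicon is replaced by a mask token; this reading is an assumption based on the flag name, and SentiWSP_Pretrain_Word.py may behave differently. The lexicon, the `[MASK]` string, and the function name are hypothetical.

```python
import random

# Hypothetical toy lexicon; the lexicon used by the repository is not shown here.
SENTIMENT_WORDS = {"good", "great", "bad", "terrible", "love", "hate"}
MASK_TOKEN = "[MASK]"


def mask_sentiment_words(tokens, sentimask_prob, rng):
    """Mask each lexicon token independently with probability sentimask_prob.

    Returns the masked token list and the positions that were masked.
    """
    masked, positions = [], []
    for i, tok in enumerate(tokens):
        if tok.lower() in SENTIMENT_WORDS and rng.random() < sentimask_prob:
            masked.append(MASK_TOKEN)
            positions.append(i)
        else:
            masked.append(tok)
    return masked, positions


tokens = "the plot was great but the acting was terrible".split()
masked, positions = mask_sentiment_words(
    tokens, sentimask_prob=0.5, rng=random.Random(0)
)
print(masked, positions)
```

With `sentimask_prob=1.0` every lexicon word is masked, and with `0.0` the sentence is returned unchanged; non-sentiment tokens are never masked in this sketch.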

Sentence-level pre-training

  1. Warm-up
    python -m torch.distributed.launch \
        --nproc_per_node=4 \
        --master_port=9999 \
        SentiWSP_Pretrain_Warmup_inbatch.py \
        --load_model=word5_large_model \
        --gpu_num=4 \
        --batch_size=32 \
        --max_len=128 \
        --save_model=./word_sen_model/
  2. Cross-batch
    • ANN Index Build:
      python SentiWSP_Pretrain_ANCE_GEN.py \
          --gpu_num=1 \
          --sentimask_prob=0.7 \
          --max_length=128 \
          --model_path=word_sen_model
    • Train:
      python -m torch.distributed.launch \
          --nproc_per_node=4 \
          --master_port=9999 \
          SentiWSP_Pretrain_ANCE_TRAIN.py \
          --load_model=word_sen_model \
          --gpu_num=4 \
          --batch_size=32 \
          --max_len=128 \
          --save_model=./word_sen_model_iter_1/

      Run "ANN Index Build" and "Train" alternately, changing the save_model name at each iteration, or write a shell script that loops over the two steps.
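
The alternation between "ANN Index Build" and "Train" can be sketched as a small driver. This is a minimal sketch under stated assumptions, not the authors' script: it assumes each round builds the index from, and resumes training from, the checkpoint saved by the previous round, and that checkpoints are named `word_sen_model_iter_<n>` following the example above. The function names, `num_rounds`, and `dry_run` are hypothetical; only flags that appear in the commands above are used.

```python
import shlex
import subprocess


def ance_round_commands(round_idx, base_model="word_sen_model", nproc=4):
    """Return (build_cmd, train_cmd) for one 1-based cross-batch round."""
    # Assumption: round 1 starts from the warm-up model, later rounds from
    # the checkpoint written by the previous round.
    load = base_model if round_idx == 1 else f"{base_model}_iter_{round_idx - 1}"
    save = f"{base_model}_iter_{round_idx}"
    build = [
        "python", "SentiWSP_Pretrain_ANCE_GEN.py",
        "--gpu_num=1", "--sentimask_prob=0.7", "--max_length=128",
        f"--model_path={load}",
    ]
    train = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc}", "--master_port=9999",
        "SentiWSP_Pretrain_ANCE_TRAIN.py",
        f"--load_model={load}", f"--gpu_num={nproc}",
        "--batch_size=32", "--max_len=128",
        f"--save_model=./{save}/",
    ]
    return build, train


def run_rounds(num_rounds, dry_run=True):
    """Print the commands for each round and, unless dry_run, execute them."""
    for i in range(1, num_rounds + 1):
        for cmd in ance_round_commands(i):
            print(shlex.join(cmd))
            if not dry_run:
                subprocess.run(cmd, check=True)  # stop on the first failure


run_rounds(num_rounds=2)  # dry run: only prints the four commands
```

Keeping `dry_run=True` by default makes it easy to inspect the generated paths before launching multi-GPU jobs.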

Thanks

Many thanks to the Hugging Face Transformers GitHub repository; our code is based on their framework.