This repository contains the code for our EMNLP'20 paper
SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup
[paper] [slides]
Install the required packages:
pip install -r requirements.txt
data_dir: the data file to use; the CoNLL-03 dataset is provided here
max_seq_length: maximum length of each sequence
num_train_epochs: number of training epochs
train_batch_size: batch size during model training
active_policy: query policy for active learning
augment_method: augmentation method
augment_rate: augmentation rate
hyper_alpha: parameter of the Beta distribution

Random Sampling
python active_learn.py --active_policy=random
Least Confidence Sampling
python active_learn.py --active_policy=lc
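
As a rough illustration of what a least-confidence query policy computes, the sketch below scores a sequence as one minus the probability of its most likely labelling and picks the highest-scoring sequences. It is a minimal sketch, not the code in active_learn.py: the function names `least_confidence` and `select_least_confident` are hypothetical, and it assumes per-token label distributions that are treated as independent.

```python
import math


def least_confidence(token_probs):
    """Return 1 - P(most likely labelling) for one sequence.

    token_probs: list of per-token label distributions (each sums to 1).
    Assumption: tokens are treated as independent, so the probability of
    the best labelling is the product of the per-token maxima.
    """
    # Sum log-probabilities instead of multiplying, for numerical stability.
    log_p = sum(math.log(max(dist)) for dist in token_probs)
    return 1.0 - math.exp(log_p)


def select_least_confident(pool_probs, k):
    """Return indices of the k sequences the model is least confident about."""
    scores = [least_confidence(p) for p in pool_probs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

Calling `select_least_confident(pool_probs, k)` on the model's predictions for the unlabeled pool would return the indices to send for annotation in that round.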
Normalized Token Entropy Sampling
python active_learn.py --active_policy=nte
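
For comparison, a normalized token entropy score can be sketched as the entropy of each token's label distribution, averaged over the sequence length so that long sequences are not favored simply for being long. This is a minimal sketch under that reading of the name; the function `normalized_token_entropy` is hypothetical and is not taken from active_learn.py.

```python
import math


def normalized_token_entropy(token_probs):
    """Average per-token entropy of the predicted label distributions.

    token_probs: non-empty list of per-token label distributions.
    Higher values mean the model is more uncertain about the sequence.
    """
    total = 0.0
    for dist in token_probs:
        # Skip zero probabilities: p * log(p) tends to 0 as p tends to 0.
        total -= sum(p * math.log(p) for p in dist if p > 0.0)
    # Dividing by the number of tokens is the "normalized" part.
    return total / len(token_probs)
```

Sequences with the highest score would be queried first, in the same way as in the least-confidence policy.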
Whole sequence mixup
python active_learn.py --augment_method=soft
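
The role of hyper_alpha in mixup can be illustrated with a short sketch: a mixing coefficient is drawn from Beta(alpha, alpha), and two equal-length sequences are interpolated token by token in both embedding space and label space. This is a simplified sketch, not the repository's implementation. The names `mix_vectors` and `whole_sequence_mixup` are hypothetical, the inputs are assumed to be plain lists of embedding vectors and one-hot label vectors, and the sketch stops at the mixed vectors without covering any later step the repository may apply.

```python
import random


def mix_vectors(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two equal-length vectors."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def whole_sequence_mixup(emb_a, lab_a, emb_b, lab_b, alpha, rng=None):
    """Mix two equal-length sequences token by token.

    emb_a, emb_b: lists of token embedding vectors.
    lab_a, lab_b: lists of one-hot label vectors, one per token.
    alpha: Beta distribution parameter (the role played by hyper_alpha).
    Returns the mixed embeddings, the mixed soft labels, and the coefficient.
    """
    if not (len(emb_a) == len(emb_b) == len(lab_a) == len(lab_b)):
        raise ValueError("both sequences must have the same number of tokens")
    rng = rng or random.Random()
    # Draw the mixing coefficient once and share it across all tokens.
    lam = rng.betavariate(alpha, alpha)
    mixed_emb = [mix_vectors(x, y, lam) for x, y in zip(emb_a, emb_b)]
    mixed_lab = [mix_vectors(x, y, lam) for x, y in zip(lab_a, lab_b)]
    return mixed_emb, mixed_lab, lam
```

Because the same coefficient is applied to embeddings and labels, each mixed label vector remains a valid probability distribution. Larger values of alpha concentrate the coefficient around 0.5, while small values push it toward 0 or 1.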
Sub-sequence mixup
python active_learn.py --augment_method=slack
Label-constrained sub-sequence mixup
python active_learn.py --augment_method=lf